Polynucleotides and polypeptides in plants

ABSTRACT

The invention relates to plant transcription factor polypeptides, polynucleotides that encode them, homologs from a variety of plant species, and methods of using the polynucleotides and polypeptides to produce transgenic plants having advantageous properties compared to a reference plant. Sequence information related to these polynucleotides and polypeptides can also be used in bioinformatic search methods and is also disclosed.

RELATIONSHIP TO COPENDING APPLICATIONS

This application claims the benefit of U.S. Non-provisional application Ser. No. 09/394,519, filed Sep. 13, 1999, U.S. Non-provisional application Ser. No. 09/489,376, filed Jan. 21, 2000, U.S. Non-provisional application Ser. No. 09/506,720, filed Feb. 17, 2000, U.S. Non-provisional application Ser. No. 09/533,030, filed Mar. 22, 2000, U.S. Non-provisional application Ser. No. 09/533,392, filed Mar. 22, 2000, U.S. Non-provisional application Ser. No. 09/533,029, filed Mar. 22, 2000, U.S. Non-provisional application Ser. No. 09/532,591, filed Mar. 22, 2000, U.S. Non-provisional application Ser. No. 09/533,648, filed Mar. 22, 2000, U.S. Non-provisional application Ser. No. 09/958,131, filed Jan. 30, 2002, PCT Application No. PCT/US00/09448, filed Apr. 6, 2000, U.S. Non-provisional application Ser. No. 09/713,994, filed Nov. 16, 2000, U.S. Non-provisional application Ser. No. 09/819,142, filed Mar. 27, 2001, U.S. Non-provisional application Ser. No. 09/837,444, filed Apr. 18, 2001, U.S. Provisional Application No. 60/310,847, filed Aug. 9, 2001, U.S. Provisional Application No. 60/336,049, filed Nov. 19, 2001, U.S. Provisional Application No. 60/338,692, filed Dec. 11, 2001, U.S. Non-provisional application Ser. No. 10/171,468, filed Jun. 14, 2002, U.S. Non-provisional application Ser. No. 10/225,066, filed Aug. 9, 2002, U.S. Non-provisional application Ser. No. 10/225,067, filed Aug. 9, 2002, U.S. Non-provisional application Ser. No. 10/225,068, filed Aug. 9, 2002, U.S. Provisional Application No. 60/434,166, filed Dec. 17, 2002, and U.S. Non-provisional application Ser. No. 10/374,780, filed Feb. 25, 2003, the entire contents of which are hereby incorporated by reference.

TECHNICAL FIELD

This invention relates to the field of plant biology. More particularly, the present invention pertains to compositions and methods for phenotypically modifying a plant.

BACKGROUND OF THE INVENTION

A plant's traits, such as its biochemical, developmental, or phenotypic characteristics, may be controlled through a number of cellular processes. One important way to manipulate that control is through transcription factors, proteins that influence the expression of a particular gene or sets of genes. Transformed and transgenic plants that comprise cells having altered levels of at least one selected transcription factor, for example, possess advantageous or desirable traits. Strategies for manipulating traits by altering a plant cell's transcription factor content can therefore result in plants and crops with new and/or improved commercially valuable properties.

Transcription factors can modulate gene expression, either increasing or decreasing (inducing or repressing) the rate of transcription. This modulation results in differential levels of gene expression at various developmental stages, in different tissues and cell types, and in response to different exogenous (e.g., environmental) and endogenous stimuli throughout the life cycle of the organism.

Because transcription factors are key controlling elements of biological pathways, altering the expression levels of one or more transcription factors can change entire biological pathways in an organism. For example, manipulation of the levels of selected transcription factors may result in increased expression of economically useful proteins or biomolecules in plants or improvement in other agriculturally relevant characteristics. Conversely, blocked or reduced expression of a transcription factor may reduce biosynthesis of unwanted compounds or remove an undesirable trait. Therefore, manipulating transcription factor levels in a plant offers tremendous potential in agricultural biotechnology for modifying a plant's traits. A number of the agriculturally relevant characteristics of plants, and desirable traits that may be imbued by modified transcription factor gene expression, are listed below.

Useful Plant Traits

Category: Abiotic Stress; Desired Trait: Chilling Tolerance

The term “chilling sensitivity” has been used to describe many types of physiological damage produced at low, but above freezing, temperatures. Most crops of tropical origins such as soybean, rice, maize and cotton are easily damaged by chilling. Typical chilling damage includes wilting, necrosis, chlorosis or leakage of ions from cell membranes. The underlying mechanisms of chilling sensitivity are not yet completely understood, but probably involve the level of membrane saturation and other physiological deficiencies. For example, photoinhibition of photosynthesis (disruption of photosynthesis due to high light intensities) often occurs under clear atmospheric conditions subsequent to cold late summer/autumn nights. By some estimates, chilling accounts for monetary losses in the United States (US) second only to drought and flooding. For example, chilling may lead to yield losses and lower product quality through the delayed ripening of maize. Another consequence of poor growth is the rather poor ground cover of maize fields in spring, often resulting in soil erosion, increased occurrence of weeds, and reduced uptake of nutrients. A retarded uptake of mineral nitrogen could also lead to increased losses of nitrate into the ground water.

Category: Abiotic Stress; Desired Trait: Freezing Tolerance.

Freezing is a major environmental stress that limits where crops can be grown and that reduces yields considerably, depending on the weather in a particular growing season. In addition to exceptionally stressful years that cause measurable losses of billions of dollars, less extreme stress almost certainly causes smaller yield reductions over larger areas to produce yield reductions of similar dollar value every year. For instance, in the US, the 1995 early fall frosts are estimated to have caused losses of over one billion dollars to corn and soybeans. The spring of 1998 saw an estimated $200 M of damages to Georgia alone, in the peach, blueberry and strawberry industries. The occasional freezes in Florida have shifted the citrus belt further south due to $100 M or more losses. California sustained $650 M of damage in 1998 to the citrus crop due to a winter freeze. In addition, certain crops such as Eucalyptus, which has the very favorable properties of rapid growth and good wood quality for pulping, are not able to grow in the southeastern states due to occasional freezes.

Inherent winter hardiness of the crop determines in which agricultural areas it can survive the winter. For example, for wheat, the northern central portion of the US has winters that are too cold for good winter wheat crops. Approximately 20% of the US wheat crop is spring wheat, with a market value of $2 billion. Areas growing spring wheat could benefit by growing winter wheat that had increased winter hardiness. Assuming a 25% yield increase when growing winter wheat, this would create $500 M of increased value. Additionally, the existing winter wheat is severely stressed by freezing conditions and should have improved yields with increased tolerance to these stresses. An estimate of the yield benefit of these traits is 10% of the $4.4 billion winter wheat crop in the US or $444 M of yield increase, as well as better survival in extreme freezing conditions that occur periodically.
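The two value estimates in the preceding paragraph follow from simple percentage arithmetic, which can be sketched as below. This is only an illustrative check: the function name `added_value_musd` and the constants are hypothetical labels, and the crop values and percentages are the ones quoted in the text. Note that multiplying 10% by $4.4 billion gives $440 M, slightly below the $444 M figure stated above; the inputs or rounding behind the stated figure are not given in the text.

```python
def added_value_musd(crop_value_musd: int, yield_gain_pct: int) -> int:
    """Return the added crop value in millions of US dollars.

    Integer arithmetic is used so the result is exact for whole-number inputs.
    """
    return crop_value_musd * yield_gain_pct // 100


# Figures quoted in the text, expressed in millions of US dollars.
SPRING_WHEAT_VALUE_MUSD = 2_000   # $2 billion spring wheat crop
WINTER_WHEAT_VALUE_MUSD = 4_400   # $4.4 billion winter wheat crop

# Replacing spring wheat with hardier winter wheat (assumed 25% yield gain).
spring_to_winter_gain = added_value_musd(SPRING_WHEAT_VALUE_MUSD, 25)

# Improved freezing tolerance in existing winter wheat (estimated 10% gain).
hardier_winter_gain = added_value_musd(WINTER_WHEAT_VALUE_MUSD, 10)

print(f"Spring-to-winter conversion: ${spring_to_winter_gain} M")
print(f"Hardier winter wheat:        ${hardier_winter_gain} M")
```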

Thus plants more resistant to freezing, both midwinter freezing and sudden freezes, would protect a farmer's investment, improve yield and quality, and allow some geographies to grow more profitable and productive crops. Additionally, winter crops such as canola, wheat and barley have 25% to 50% yield increases relative to spring planted varieties of the same crops. This yield increase is due to the “head start” the fall planted crop has over the spring planted crop and its reaching maturity earlier while the temperatures, soil moisture and lack of pathogens provide more favorable conditions.

Category: Abiotic Stress; Desired Trait: Salt Tolerance.

One in five hectares of irrigated land is damaged by salt, an important historical factor in the decline of ancient agrarian societies. This condition is only expected to worsen, further reducing the availability of arable land and crop production, since none of the top five food crops (wheat, corn, rice, potatoes, and soybean) can tolerate excessive salt.

Detrimental effects of salt on plants are a consequence of both water deficit resulting in osmotic stress (similar to drought stress) and the effects of excess sodium ions on critical biochemical processes. As with freezing and drought, high salinity causes water deficit; the presence of high salt makes it difficult for plant roots to extract water from their environment (Buchanan et al. (2000) in Biochemistry and Molecular Biology of Plants, American Society of Plant Physiologists, Rockville, Md.). Soil salinity is thus one of the more important variables that determines where a plant may thrive. In many parts of the world, sizable land areas are uncultivable due to naturally high soil salinity. To compound the problem, salination of soils that are used for agricultural production is a significant and increasing problem in regions that rely heavily on agriculture. The latter is compounded by over-utilization, over-fertilization and water shortage, typically caused by climatic change and the demands of increasing population. Salt tolerance is of particular importance early in a plant's lifecycle, since evaporation from the soil surface causes upward water movement, and salt accumulates in the upper soil layer where the seeds are placed. Thus, germination normally takes place at a salt concentration much higher than the mean salt level in the whole soil profile.

Category: Abiotic Stress; Desired Trait: Drought Tolerance.

While much of the weather that we experience is brief and short-lived, drought is a more gradual phenomenon, slowly taking hold of an area and tightening its grip with time. In severe cases, drought can last for many years, and can have devastating effects on agriculture and water supplies. With burgeoning population and chronic shortage of available fresh water, drought is not only the number one weather related problem in agriculture, it also ranks as one of the major natural disasters of all time, causing not only economic damage, but also loss of human lives. For example, losses from the US drought of 1988 exceeded $40 billion, exceeding the losses caused by Hurricane Andrew in 1992, the Mississippi River floods of 1993, and the San Francisco earthquake in 1989. In some areas of the world, the effects of drought can be far more severe. In the Horn of Africa the 1984-1985 drought led to a famine that killed 750,000 people.

Problems for plants caused by low water availability include mechanical stresses caused by the withdrawal of cellular water. Drought also causes plants to become more susceptible to various diseases (Simpson (1981). “The Value of Physiological Knowledge of Water Stress in Plants”, In Water Stress on Plants, (Simpson, G. M., ed.), Praeger, N.Y., pp. 235-265).

In addition to the many land regions of the world that are too arid for most if not all crop plants, overuse and over-utilization of available water is resulting in an increasing loss of agriculturally-usable land, a process which, in the extreme, results in desertification. The problem is further compounded by increasing salt accumulation in soils, as described above, which adds to the loss of available water in soils.

Category: Abiotic Stress; Desired Trait: Heat Tolerance.

Germination of many crops is very sensitive to temperature. A transcription factor that would enhance germination in hot conditions would be useful for crops that are planted late in the season or in hot climates.

Seedlings and mature plants that are exposed to excess heat may experience heat shock, which may arise in various organs, including leaves and particularly fruit, when transpiration is insufficient to overcome heat stress. Heat also damages cellular structures, including organelles and cytoskeleton, and impairs membrane function (Buchanan, supra).

Heat shock may result in a decrease in overall protein synthesis, accompanied by expression of heat shock proteins. Heat shock proteins function as chaperones and are involved in refolding proteins denatured by heat.

Category: Abiotic Stress; Desired Trait: Tolerance to Low Nitrogen and Phosphorus.

The ability of all plants to remove nutrients from their environment is essential to survival. Thus, identification of genes that encode polypeptides with transcription factor activity may allow for the generation of transgenic plants that are better able to make use of available nutrients in nutrient-poor environments.

Among the most important macronutrients for plant growth that have the largest impact on crop yield are nitrogenous and phosphorus-containing compounds. Nitrogen- and phosphorus-containing fertilizers are used intensively in agricultural practices today. An increase in grain crop yields from 0.5 to 1.0 metric tons per hectare to 7 metric tons per hectare accompanied the use of commercial fixed nitrogen fertilizer in production farming (Vance (2001) Plant Physiol. 127: 390-397). Given current practices, in order to meet food production demands in years to come, considerable increases in the amount of nitrogen- and phosphorus-containing fertilizers will be required (Vance, supra).

Nitrogen is the most abundant element in the Earth's atmosphere, yet it is one of the most limiting elements to plant growth due to its lack of availability in the soil. Plants obtain N from the soil from several sources including commercial fertilizers, manure and the mineralization of organic matter. The intensive use of N fertilizers in present agricultural practices is problematic: N fertilizer is made by the energy-intensive Haber-Bosch process, and it is estimated that the US annually uses between 3% and 5% of the nation's natural gas for this process. In addition to the expense of N fertilizer production and the depletion of non-renewable resources, the use of N fertilizers has led to the eutrophication of freshwater ecosystems and the contamination of drinking water due to the runoff of excess fertilizer into ground water supplies.

Phosphorus is second only to N in its importance as a macronutrient for plant growth and in its impact on crop yield. Phosphorus (P) is extremely immobile and not readily available to roots in the soil and is therefore often growth limiting to plants. Inorganic phosphate (Pi) is a constituent of several important molecules required for energy transfer, metabolic regulation and protein activation (Marschner (1995) Mineral Nutrition of Higher Plants, 2nd ed., Academic Press, San Diego, Calif.). Plants have evolved several strategies to help cope with P and N deprivation that include metabolic as well as developmental adaptations. Most, if not all, of these strategies have components that are regulated at the level of transcription and therefore are amenable to manipulation by transcription factors. Metabolic adaptations include increasing the availability of P and N by increasing uptake from the soil through the induction of high affinity and low affinity transporters, and/or increasing its mobilization in the plant. Developmental adaptations include increases in primary and secondary roots, increases in root hair number and length, and associations with mycorrhizal fungi (Bates and Lynch (1996) Plant Cell Environ. 19: 529-538; Harrison (1999) Annu. Rev. Plant Physiol. Plant Mol. Biol. 50: 361-389).

Category: Biotic Stress; Desired Trait: Disease Resistance.

Disease management is a significant expense in crop production worldwide. According to EPA reports for 1996 and 1997, US farmers spend approximately $6 billion on fungicides annually. Despite this expenditure, according to a survey conducted by the Food and Agriculture Organization, plant diseases still reduce worldwide crop productivity by 12% and in the United States alone, economic losses due to plant pathogens amount to 9.1 billion dollars (FAO, 1993). Data from these reports and others demonstrate that despite the availability of chemical control only a small proportion of the losses due to disease can be prevented. Not only are fungicides and anti-bacterial treatments expensive to growers, but their widespread application poses both environmental and health risks. The use of plant biotechnology to engineer disease resistant crops has the potential to make a significant economic impact on agriculture and forestry industries in two ways: reducing the monetary and environmental expense of fungicide application and reducing both pre-harvest and post-harvest crop losses that occur now despite the use of costly disease management practices.

Fungal, bacterial, oomycete, viral, and nematode diseases of plants are ubiquitous and important problems, and often severely impact yield and quality of crop and other plants. A very few examples of diseases of plants include:

-   Powdery mildew, caused by the fungi Erysiphe, Sphaerotheca, Phyllactinia, Microsphaera, Podosphaera or Uncinula, in, for example, wheat, bean, cucurbit, lettuce, pea, grape, tree fruit crops, as well as roses, phlox, lilacs, grasses, and Euonymus;
-   Fusarium-caused diseases such as Fusarium wilt in cucurbits, Fusarium head blight in barley and wheat, wilt and crown and root rot in tomatoes;
-   Sudden oak death, caused by the oomycete Phytophthora ramorum; this disease was first detected in 1995 in California tan oaks. The disease has since killed more than 100,000 tan oaks, coast live oaks, black oaks, and Shreve's oaks in coastal regions of northern California, and more recently in southwestern Oregon (Roach (2001) National Geographic News, Dec. 6, 2001);
-   Black Sigatoka, a fungal disease caused by Mycosphaerella species that attacks banana foliage, is spreading throughout the regions of the world that are responsible for producing most of the world's banana crop;
-   Eutypa dieback, caused by Eutypa lata, affects a number of crop plants, including vine grape. Eutypa dieback delays shoot emergence, and causes chlorosis, stunting, and tattering of leaves;
-   Pierce's disease, caused by the bacterium Xylella fastidiosa, precludes growth of grapes in the southeastern United States, and threatens the profitable wine grape industry in northern California. The bacterium clogs the vasculature of the grapevines, resulting in foliar scorching followed by slow death of the vines. There is no known treatment for Pierce's disease;
-   Bacterial Spot caused by the bacterium Xanthomonas campestris causes serious disease problems on tomatoes and peppers. It is a significant problem in the Florida tomato industry because it spreads rapidly, especially in warm periods where there is wind-driven rain. Under these conditions, there are no adequate control measures;
-   Diseases caused by viruses of the family Geminiviridae are a growing agricultural problem worldwide. Geminiviruses have caused severe crop losses in tomato, cassava, and cotton. For instance, in the 1991-1992 growing season in Florida, geminiviruses caused $140 million in damages to the tomato crop (Moffat (1991) Science 286: 1835). Geminiviruses have the ability to recombine between strains to rapidly produce new virulent varieties. Therefore, there is a pressing need for broad-spectrum geminivirus control;
-   The soybean cyst nematode, Heterodera glycines, causes stunting and chlorosis of soybean plants, which results in yield losses or plant death from severe infestation. Annual losses in the United States have been estimated at $1.5 billion (University of Minnesota Extension Service).

The aforementioned pathogens represent a very small fraction of diverse species that seriously affect plant health and yield. For a more complete description of numerous plant diseases, see, for example, Vidhyasekaran (1997) Fungal Pathogenesis in Plants and Crops: Molecular Biology and Host Defense Mechanisms, Marcel Dekker, Monticello, N.Y., or Agrios (1997) Plant Pathology, Academic Press, New York, N.Y. Plants that are able to resist disease may produce significantly higher yields and improved food quality. It is thus of considerable importance to find genes that reduce or prevent disease.

Category: Light Response; Desired Trait: Reduced Shade Avoidance.

Shade avoidance describes the process in which plants grown in close proximity attempt to out-compete each other by increasing stem length at the expense of leaf, fruit and storage organ development. This is caused by the plant's response to far-red radiation reflected from leaves of neighboring plants, which is mediated by phytochrome photoreceptors. Close proximity to other plants, as is produced in high-density crop plantings, increases the relative proportion of far-red irradiation, and therefore induces the shade avoidance response. Shade avoidance adversely affects biomass and yield, particularly when leaves, fruits or other storage organs constitute the desired crop (see, for example, Smith (1982) Annu. Rev. Plant Physiol. 33: 481-518; Ballare et al. (1990) Science 247: 329-332; Smith (1995) Annu. Rev. Plant Physiol. Plant Mol. Biol. 46: 289-315; and Schmitt et al. (1995) American Naturalist 146: 937-953). Alteration of the shade avoidance response in tobacco through alteration of phytochrome levels has been shown to produce an increase in harvest index (leaf biomass/total biomass) at high planting density, which would result in higher yield (Robson et al. (1996) Nature Biotechnol. 14: 995-998).
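The harvest index mentioned above is the ratio of leaf biomass to total biomass, and a minimal sketch of that calculation follows. The function name `harvest_index` and the biomass values are hypothetical and chosen only for illustration; they are not data from the cited study.

```python
def harvest_index(leaf_biomass_g: float, total_biomass_g: float) -> float:
    """Return harvest index as leaf biomass divided by total biomass."""
    if total_biomass_g <= 0:
        raise ValueError("total biomass must be positive")
    if not 0 <= leaf_biomass_g <= total_biomass_g:
        raise ValueError("leaf biomass must lie between 0 and total biomass")
    return leaf_biomass_g / total_biomass_g


# Hypothetical plants of equal total biomass grown at high planting density.
# A plant showing strong shade avoidance invests more in stem, less in leaf.
shade_avoiding = harvest_index(leaf_biomass_g=30.0, total_biomass_g=120.0)
reduced_avoidance = harvest_index(leaf_biomass_g=45.0, total_biomass_g=120.0)

print(f"Shade-avoiding plant:    {shade_avoiding:.3f}")
print(f"Reduced shade avoidance: {reduced_avoidance:.3f}")
```

Under these assumed values the plant with a reduced shade avoidance response has the higher harvest index, which is the direction of change described in the text.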

Category: Flowering Time; Desired Trait: Altered Flowering Time and Flowering Control.

Timing of flowering has a significant impact on production of agricultural products. For example, varieties with different flowering responses to environmental cues are necessary to adapt crops to different production regions or systems. Such a range of varieties has been developed for many crops, including wheat, corn, soybean, and strawberry. Improved methods for alteration of flowering time will facilitate the development of new, geographically adapted varieties.

Breeding programs for the development of new varieties can be limited by the seed-to-seed cycle. Thus, breeding new varieties of plants with multi-year cycles (such as biennials, e.g. carrot, or fruit trees, such as citrus) can be very slow. With respect to breeding programs, there would be a significant advantage in having commercially valuable plants that exhibit controllable and modified periods to flowering (“flowering times”). For example, accelerated flowering would shorten crop and tree breeding programs.

Improved flowering control allows more than one planting and harvest of a crop to be made within a single season. Early flowering would also shorten the time to harvest for plants in which the flower portion of the plant constitutes the product (e.g., broccoli, cauliflower, and other edible flowers). In addition, chemical control of flowering through induction or inhibition of flowering in plants could provide a significant advantage to growers by inducing more uniform fruit production (e.g., in strawberry).

A sizable number of plants for which the vegetative portion of the plant forms the valuable crop tend to “bolt” dramatically (e.g., spinach, onions, lettuce), after which biomass production declines and product quality diminishes (e.g., through flowering-triggered senescence of vegetative parts). Delay or prevention of flowering may also reduce or preclude dissemination of pollen from transgenic plants.

Category: Growth Rate; Desired Trait: Modified Growth Rate.

For almost all commercial crops, it is desirable to use plants that establish more quickly, since seedlings and young plants are particularly susceptible to stress conditions such as salinity or disease. Since many weeds may outgrow young crops or out-compete them for nutrients, it would also be desirable to determine means for allowing young crop plants to out-compete weed species. Increasing seedling growth rate (emergence) contributes to seedling vigor and allows for crops to be planted earlier in the season with less concern for losses due to environmental factors. Early planting helps add days to the critical grain-filling period and increases yield.

Providing means to speed up or slow down plant growth would also be desirable to ornamental horticulture. If such means were provided, slow-growing plants might exhibit a prolonged pollen-producing or fruiting period, thus improving fertilization or extending the harvesting season.

Category: Growth Rate; Desired Trait: Modified Senescence and Cell Death.

Premature senescence, triggered by various plant stresses, can limit production of both leaf biomass and seed yield. Transcription factor genes that suppress premature senescence or cell death in response to stresses can provide means for increasing yield. Delay of normal developmental senescence could enhance yield also, particularly for those plants for which the vegetative part of the plant represents the commercial product (e.g., spinach, lettuce).

Although leaf senescence is thought to be an evolutionary adaptation to recycle nutrients, the ability to control senescence in an agricultural setting has significant value. For example, a delay in leaf senescence in some maize hybrids is associated with a significant increase in yields, and a delay of a few days in the senescence of soybean plants can have a large impact on yield. In an experimental setting, tobacco plants engineered to inhibit leaf senescence had a longer photosynthetic lifespan, and produced a 50% increase in dry weight and seed yield (Gan and Amasino (1995) Science 270: 1986-1988). Delayed flower senescence may generate plants that retain their blossoms longer and this may be of potential interest to the ornamental horticulture industry, and delayed foliar and fruit senescence could improve post-harvest shelf-life of produce.

Further, programmed cell death plays a role in other plant responses, including the resistance response to disease, and some symptoms of diseases, for example, as caused by necrotrophic pathogens such as Botrytis cinerea and Sclerotinia sclerotiorum (Dickman et al. Proc. Natl. Acad. Sci., 98: 6957-6962). Localized senescence and/or cell death can be used by plants to contain the spread of harmful microorganisms. A specific localized cell death response, the “hypersensitive response”, is a component of race-specific disease resistance mediated by plant resistance genes. The hypersensitive response is thought to help limit pathogen growth and to initiate a signal transduction pathway that leads to the induction of systemic plant defenses. Accelerated senescence may be a defense against obligate pathogens, such as powdery mildew, that rely on healthy plant tissue for nutrients. With regard to powdery mildew, Botrytis cinerea and Sclerotinia sclerotiorum and other pathogens, transcription factors that ameliorate cell death and/or damage may reduce the significant economic losses encountered, such as, for example, Botrytis cinerea in strawberry and grape.

Category: Growth Regulator; Desired Trait: Altered Sugar Sensing

Sugars are key regulatory molecules that affect diverse processes in higher plants including germination, growth, flowering, senescence, sugar metabolism and photosynthesis. Sucrose, for example, is the major transport form of photosynthate and its flux through cells has been shown to affect gene expression and alter storage compound accumulation in seeds (source-sink relationships). Glucose-specific hexose-sensing has also been described in plants and is implicated in cell division and repression of “famine” genes (photosynthetic or glyoxylate cycles).

Category: Morphology; Desired Trait: Altered Morphology

Trichomes are branched or unbranched epidermal outgrowths or hair structures on a plant. Trichomes produce a variety of secondary biochemicals such as diterpenes and waxes, the former being important as, for example, insect pheromones, and the latter as protectants against desiccation and herbivorous pests. Since diterpenes also have commercial value as flavors, aromas, pesticides and cosmetics, and potential value as anti-tumor agents and inflammation-mediating substances, they have been both products and the target of considerable research. In most cases where the metabolic pathways are impossible to engineer, increasing trichome density or size on leaves may be the only way to increase plant productivity. Thus, it would be advantageous to discover trichome-affecting transcription factor genes for the purpose of increasing trichome density, size, or type to produce plants that are better protected from insects or that yield higher amounts of secondary metabolites.

The ability to manipulate wax composition, amount, or distribution could modify plant tolerance to drought and low humidity or resistance to insects, as well as plant appearance. In particular, a possible application for a transcription factor gene that reduces wax production in sunflower seed coats would be to reduce fouling during seed oil processing. Antisense or co-suppression of transcription factors involved in wax biosynthesis in a tissue specific manner can be used to specifically alter wax composition, amount, or distribution in those plants and crops from which wax is either a valuable attribute or product or an undesirable constituent of plants.

Other morphological characteristics that may be desirable in plants include those of an ornamental nature. These include changes in seed color, overall color, leaf and flower shape, leaf color, leaf size, or glossiness of leaves. Plants that produce dark leaves may have benefits for human health; flavonoids, for example, have been used to inhibit tumor growth, prevent bone loss, and prevent lipid oxidation in animals and humans. Plants in which leaf size is increased would likely provide greater biomass, which would be particularly valuable for crops in which the vegetative portion of the plant constitutes the product. Plants with glossy leaves generally produce greater epidermal wax, which, if it could be augmented, would result in a pleasing appearance for many ornamentals, help prevent desiccation, and help resist herbivorous insects and disease-causing agents. Changes in plant or plant part coloration, brought about by modifying, for example, anthocyanin levels, would provide novel morphological features.

In many instances, the seeds of a plant constitute a valuable crop.These include, for example, the seeds of many legumes, nuts and grains.The discovery of means for producing larger seed would providesignificant value by bringing about an increase in crop yield.

Plants with altered inflorescence, including, for example, largerflowers or distinctive floral configurations, may have high value in theornamental horticulture industry.

Modifications to flower structure may have advantageous or deleteriouseffects on fertility, and could be used, for example, to decreasefertility by the absence, reduction or screening of reproductivecomponents. This could be a desirable trait, as it could be exploited toprevent or minimize the escape of the pollen of genetically modifiedorganisms into the environment.

Manipulation of inflorescence branching patterns may also be used toinfluence yield and offer the potential for more effective harvestingtechniques. For example, a “self pruning” mutation of tomato results ina determinate growth pattern and facilitates mechanical harvesting(Pnueli et al. (2001) Plant Cell 13(12): 2687-2702).

Alterations of apical dominance or plant architecture could create new plant varieties. Dwarf plants may be of potential interest to the ornamental horticulture industry.

Category: Seed Biochemistry; Desired Trait: Altered Seed Oil

The composition of seeds, particularly with respect to seed oil quantity and/or composition, is very important for the nutritional value and production of various food and feed products. Desirable improvements to oils include enhanced heat stability, improved nutritional quality through, for example, reducing the number of calories in seed, increasing the number of calories in animal feeds, or altering the ratio of saturated to unsaturated lipids comprising the oils.

Category: Seed Biochemistry; Desired Trait: Altered Seed Protein

As with seed oils, seed protein content and composition are very important for the nutritional value and production of various food and feed products. Altered protein content or concentration in seeds may be used to provide nutritional benefits, and may also prolong storage capacity, increase seed pest or disease resistance, or modify germination rates. Altered amino acid composition of seeds, through altered protein composition, is also a desired objective for nutritional improvement.

Category: Seed Biochemistry; Desired Trait: Altered Prenyl Lipids

Prenyl lipids, including the tocopherols, play a role in anchoring proteins in membranes or membranous organelles. Tocopherols have both anti-oxidant and vitamin E activity. Modified tocopherol composition of plants may thus be useful in improving membrane integrity and function, which may mitigate abiotic stresses such as heat stress. Increasing the anti-oxidant and vitamin content of plants through increased tocopherol content can provide useful human health benefits.

Category: Leaf Biochemistry; Desired Trait: Altered Glucosinolate Levels

Increases or decreases in specific glucosinolates or total glucosinolate content can be desirable depending upon the particular application. For example: (i) glucosinolates are undesirable components of the oilseeds used in animal feed, since they produce toxic effects; low-glucosinolate varieties of canola have been developed to combat this problem; (ii) some glucosinolates have anti-cancer activity; thus, increasing the levels or composition of these compounds can be of use in production of nutraceuticals; and (iii) glucosinolates form part of a plant's natural defense against insects; modification of glucosinolate composition or quantity could therefore afford increased protection from herbivores. Furthermore, tissue specific promoters can be used in edible crops to ensure that these compounds accumulate specifically in particular tissues, such as the epidermis, which are not taken for human consumption.

Category: Leaf Biochemistry; Desired Trait: Flavonoid Production

Expression of transcription factors that increase flavonoid production in plants, including anthocyanins and condensed tannins, may be used to alter pigment production for horticultural purposes, and possibly to increase stress resistance. Flavonoids have antimicrobial activity and could be used to engineer pathogen resistance. Several flavonoid compounds have human health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of lipid oxidation. Increased levels of condensed tannins in forage legumes would provide agronomic benefits in ruminants by preventing pasture bloat by collapsing protein foams within the rumen. For a review on the utilities of flavonoids and their derivatives, see Dixon et al. (1999) Trends Plant Sci. 4: 394-400.

Genetic and molecular studies on Arabidopsis have revealed that the timing of flowering is influenced by a large number of different genes (Martinez-Zapater and Somerville (1990) Plant Physiol. 92: 770-776; Koornneef et al. (1991) Mol. Gen. Genet. 229: 57-66; Martinez-Zapater et al. (1994) In Meyerowitz and Somerville, editors, Arabidopsis, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp 403-433; Koornneef et al. (1998a) Annu. Rev. Plant Physiol. Plant Mol. Biol. 49: 345-370; Koornneef et al. (1998b) Genetics 148: 885-892; Levy and Dean (1998) Plant Cell 10: 1973-1990; Simpson et al. (1999) Annu. Rev. Cell Dev. Biol. 15: 519-550; Simpson and Dean (2002) Science 296: 285-289; and Ratcliffe and Riechmann (2002) Curr. Issues Mol. Biol. 4: 77-91). Such loci ensure that the switch from vegetative to reproductive growth takes place at the most appropriate time with respect to a variety of abiotic and biotic variables. Amongst the most intensively studied effects are the responses to day length and prolonged exposure to low temperatures (vernalization).

Arabidopsis flowers rapidly in long day photoperiodic conditions of 16 hours or continuous light. However, under short day conditions of 8-10 hours of light, the plants display a much more extensive period of vegetative growth prior to flowering. Genes that control this day length response were originally identified via mutations that cause late flowering under long days, but which do not alter flowering time in short day conditions. Examples of photoperiod pathway genes include CONSTANS (CO), GIGANTEA (GI), FE, FD, and FHA. A second group of genes, which includes LUMINIDEPENDENS (LD), FCA, FVE, FY, and FPA, form an autonomous pathway that monitors the developmental state of the plant and is active under all photoperiodic conditions. Mutants for this second class of genes flower later than wild type controls irrespective of the day length (Koornneef et al. (1991); Martinez-Zapater et al. (1994); Koornneef et al. (1998a); and Koornneef et al. (1998b); all supra).

Importantly, mutants from the photoperiod and autonomous pathways also show a differential response to vernalization. Via a vernalization response, Arabidopsis ecotypes from northern latitudes, such as Stockholm, Sweden, are adapted to flower in the spring following exposure to cold winter conditions. This avoids flowering in the late summer when seed maturation might be curtailed by the onset of winter conditions. (See, for example, Vince-Prue (1975) In Photoperiodism in plants McGraw Hill, London, UK, pp 263-291; Napp-Zinn (1957) Z. Indukt. Abstammungs Vererbungsl. 88: 253-285; and Reeves and Coupland (2000) Curr. Opin. Plant Biol. 3: 37-42).

When such ecotypes are grown in the laboratory they flower late, but will flower much earlier if subjected to a cold period of 4-8 weeks while the seed is germinating. In a comparable manner, mutants from the autonomous pathway exhibit a very marked reduction in flowering time when subjected to vernalization. By contrast, mutants from the photoperiod pathway show only a minor response to cold treatments. Thus, vernalization can overcome the requirement for the autonomous pathway. (See Martinez-Zapater and Somerville (1990) supra; Koornneef et al. (1991) supra; Bagnall (1992) Aust. J. Plant Physiol. 19: 401-409; Burn et al. (1993) Proc. Natl. Acad. Sci. 90: 287-291; Lee et al. (1993) Mol. Gen. Genet. 237: 171-176; Clarke and Dean (1994) Mol. Gen. Genet. 242: 81-89; Chandler et al. (1996) Plant J. 10: 637-644; Koornneef et al. (1998b) supra.)

Genetic and molecular analyses have revealed that a MADS box protein, FLOWERING LOCUS C (FLC), is a major determinant of the vernalization response (Koornneef et al. (1994) supra; Lee et al. (1994) supra; Sanda and Amasino (1996) Mol. Gen. Genet. 251: 69-74; Michaels and Amasino (2000) Plant Cell and Environment 23: 1145-1153; Sheldon et al. (1999) Plant Cell 11: 445-458; Sheldon et al. (2002) Plant Cell 14: 2527-2537; and Rouse et al. (2002) Plant J. 29: 183-191). High levels of both FLC gene transcript and protein are present in mutants for the autonomous pathway and also in naturally late flowering northern ecotypes, which contain active alleles of a second locus, FRIGIDA (FRI; Burn et al. (1993) supra; Clarke and Dean (1994) supra; Johanson et al. (2000) Science 290: 344-347). By contrast, mutants from the photoperiod pathway, and backgrounds lacking an active FRI allele, show relatively low levels of FLC transcript. Furthermore, null alleles of flc completely suppress the late flowering caused by autonomous pathway mutations and active FRI alleles, but have no effect on the delayed flowering in photoperiod pathway mutants (Michaels and Amasino (2001) Plant Cell 13: 935-941). FLC gene expression therefore appears to be supported by FRI and strongly repressed by floral activators within the autonomous pathway.

During vernalization, FLC transcript and protein levels fall, and the plants become competent to flower (Michaels and Amasino (1999) Plant Cell 11: 949-956; Michaels and Amasino (2001) supra; Sheldon et al. (1999) supra; Sheldon et al. (2000) Proc. Natl. Acad. Sci. 97: 3753-3758; Johanson et al. (2000) supra; and Rouse et al. (2002) supra). Additionally, overexpression of FLC from a 35S CaMV promoter in the Landsberg ecotype (which lacks an active FRI allele) is sufficient to severely delay or prevent flowering, and renders the plants insensitive to vernalization (Michaels and Amasino (1999) supra; Sheldon et al. (1999) supra). These findings indicate that FLC is a potent floral repressor; it has now been shown that such repression is achieved by FLC inhibiting downstream genes that promote flowering, including SOC1 and FT (Borner et al. (2000) Plant J. 24: 591-599; Lee et al. (2000) Genes Dev. 14: 2366-2376; Onouchi et al. (2000) Plant Cell 12: 885-900; Samach et al. (2000) Science 288: 1613-1616; Michaels and Amasino (2001) supra). Thus, promotion of flowering by either the autonomous pathway or vernalization involves repression of FLC and the subsequent de-repression of FLC targets. Recently, regions within the FLC gene and its promoter have been defined which are required for its vernalization induced repression (Sheldon et al. (2002) supra). However, the molecular signaling events that lead to a fall in FLC levels during vernalization are still unclear. The products of VERNALIZATION2 and VERNALIZATION1 maintain repression of FLC, once levels of FLC transcript have declined (Gendall et al. (2001) Cell 107: 525-535; Levy et al. (2002) Science 297: 243-246), but it is not yet known how the decline is initially achieved.

A number of additional questions regarding the molecular basis of vernalization still remain unanswered. First, it has been observed that null flc mutants are responsive to vernalization (Michaels and Amasino (2001) supra). Therefore vernalization can promote flowering by other mechanisms as well as via repression of FLC. In addition, vernalization is a quantitative response to prolonged periods of cold (Sheldon et al. (2000) supra); a mechanism must therefore exist to ensure that vernalization does not always occur in response to short periods of cold, lasting only a few days.

Vernalization may also be desirable in plants that do not normally have a vernalization response. Such plants in which expression of a polynucleotide creates a vernalization response therefore may be propagated and cultivated at different latitudes and/or altitudes compared with the native plant species that do not express a polynucleotide creating a vernalization response.

The present invention relates to methods and compositions for producing transgenic plants with modified traits, particularly traits that address the agricultural and food needs described in the above background information. These traits may provide significant value in that they allow the plant to thrive in hostile environments, where, for example, temperature, water and nutrient availability or salinity may limit or prevent growth of non-transgenic plants. The traits may also comprise desirable morphological alterations, larger or smaller size, disease and pest resistance, alterations in flowering time, light response, and others.

We have identified polynucleotides encoding transcription factors, developed numerous transgenic plants using these polynucleotides, and have analyzed the plants for a variety of important traits. In so doing, we have identified important polynucleotide and polypeptide sequences for producing commercially valuable plants and crops as well as the methods for making them and using them. Other aspects and embodiments of the invention are described below and can be derived from the teachings of this disclosure as a whole.

SUMMARY OF THE INVENTION

Transgenic plants and methods for producing transgenic plants are provided. The transgenic plants comprise a recombinant polynucleotide having a polynucleotide sequence, or a complementary polynucleotide sequence thereof, that encodes a transcription factor.

The polynucleotide sequences that may encode the transcription factors are listed in the Sequence Listing and include any of SEQ ID NO: 2N-1, where N=1-480, SEQ ID NO: 2N, where N=856-969, or SEQ ID NO: 961, 962, 963, 964, 965, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1013, 1014, 1015, 1016, 1017, 1019, 1020, 1023, 1024, 1026, 1030, 1031, 1032, 1033, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1192, 1193, 1194, 1195, 1196, 1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1217, 1218, 1219, 1220, 1221, 1222, 1223, 1224, 1225, 1228, 1229, 1230, 1231, 1232, 1234, 1235, 1236, 1237, 1238, 1240, 1241, 1242, 1243, 1244, 1245, 1248, 1249, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1260, 1261, 1262, 1263, 1264, 1265, 1266, 1267, 1268, 1270, 1271, 1272, 1273, 1274, 1275, 1276, 1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284, 1285, 1286, 1287, 1288, 1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299, 1300, 1301, 1302, 1303, 1304, 1305, 1306, 1311, 1312, 1313, 1314, 1315, 1316, 1317, 1318, 1319, 1320, 1321, 1322, 1324, 1325, 1326, 1327, 1328, 1333, 1334, 1335, 1336, 1337, 1341, 1342, 1343, 1344, 1345, 1346, 1347, 1348, 1349, 1350, 1351, 1352, 1353, 1354, 1355, 1356, 1357, 1358, 1359, 1360, 1363, 1364, 1365, 1366, 1367, 1368, 1369, 1370, 1371, 1372, 1376, 1377, 1378, 1379, 1380, 1381, 1382, 1383, 1385, 1386, 1387, 1388, 1392, 1393, 1394, 1395, 1397, 1398, 1399, 1400, 1401, 1402, 1403, 1404, 1405, 1406, 1407, 1408, 1409, 1410, 1415, 1416, 1417, 1418, 1419, 1420, 1421, 1422, 1423, 1425, 1426, 1427, 1428, 1429, 1430, 1431, 1432, 1433, 1434, 1438, 1439, 1440, 1441, 1442, 1443, 1444, 1445, 1446, 1447, 1449, 1450, 1451, 1452, 1453, 1454, 1455, 1461, 1462, 1463, 1464, 1465, 1466, 1467, 1468, 1469, 1470, 1471, 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480, 1481, 1482, 1485, 1486, 1487, 1488, 1489, 1490, 1491, 1492, 1493, 1494, 1495, 1496, 1497, 1498, 1499, 1501, 1502, 1503, 1504, 1505, 1506, 1507, 1509, 1512, 1513, 1514, 1515, 1516, 1517, 1518, 1519, 1521, 1522, 1523, 1524, 1525, 1526, 1527, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535, 1536, 1537, 1544, 1545, 1546, 1547, 1548, 1549, 1550, 1551, 1552, 1553, 1554, 1555, 1556, 1557, 1558, 1559, 1560, 1561, 1562, 1570, 1571, 1572, 1573, 1574, 1575, 1576, 1577, 1578, 1579, 1580, 1581, 1584, 1585, 1586, 1587, 1588, 1589, 1590, 1591, 1592, 1593, 1595, 1596, 1597, 1598, 1599, 1600, 1601, 1602, 1603, 1604, 1605, 1606, 1607, 1608, 1609, 1610, 1613, 1614, 1615, 1616, 1617, 1621, 1622, 1623, 1624, 1625, 1628, 1629, 1630, 1631, 1632, 1633, 1634, 1637, 1638, 1639, 1642, 1643, 1644, 1645, 1646, 1647, 1648, 1649, 1650, 1651, 1652, 1653, 1654, 1659, 1660, 1661, 1662, 1663, 1664, 1665, 1666, 1667, 1668, 1669, 1670, 1671, 1674, 1675, 1676, 1677, 1678, 1679, 1681, 1683, 1684, 1685, 1690, 1691, 1692, 1693, 1694, 1695, 1697, 1698, 1699, 1700, 1701, 1704, 1705, 1706, 1707, 1708, 1709, 1710, 1711, 1944, 1946, 1948, 1950, 1952, 1954, 1956, 1958, 1960, 1962, 1964, 1966, 1968, 1970, and 1972.

The transcription factors are comprised of polypeptide sequences listed in the Sequence Listing and include any of SEQ ID NO: 2N, wherein N=1-480, SEQ ID NO: 2N-1, where N=857-970, or SEQ ID NO: 989, 990, 991, 1001, 1002, 1012, 1018, 1021, 1022, 1025, 1027, 1028, 1029, 1034, 1050, 1051, 1072, 1073, 1074, 1075, 1076, 1091, 1092, 1093, 1094, 1095, 1109, 1110, 1111, 1112, 1150, 1165, 1166, 1167, 1168, 1169, 1189, 1190, 1191, 1197, 1198, 1199, 1213, 1214, 1215, 1216, 1226, 1227, 1233, 1239, 1246, 1247, 1258, 1259, 1269, 1307, 1308, 1309, 1310, 1323, 1329, 1330, 1331, 1332, 1338, 1339, 1340, 1361, 1362, 1373, 1374, 1375, 1384, 1389, 1390, 1391, 1396, 1411, 1412, 1413, 1414, 1424, 1435, 1436, 1437, 1448, 1456, 1457, 1458, 1459, 1460, 1472, 1483, 1484, 1500, 1508, 1510, 1511, 1520, 1538, 1539, 1540, 1541, 1542, 1543, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1582, 1583, 1594, 1611, 1612, 1618, 1619, 1620, 1626, 1627, 1635, 1636, 1640, 1641, 1655, 1656, 1657, 1658, 1672, 1673, 1680, 1682, 1686, 1687, 1688, 1689, 1696, 1702, 1703, 1945, 1947, 1949, 1951, 1953, 1955, 1957, 1959, 1961, 1963, 1965, 1967, 1969, 1971, and 1973.

The transgenic plant that comprises the recombinant polynucleotide has a polynucleotide sequence (or a sequence that is complementary to this polynucleotide sequence) selected from

-   (a) a nucleotide sequence that encodes one of the aforementioned transcription factor polypeptide sequences; or
-   (b) a polypeptide sequence that comprises one of the aforementioned transcription factor polypeptides.

In an example of a preferred embodiment, the transcription factor polynucleotide sequence of (a) comprises G682, or SEQ ID NO: 467. In another example of a preferred embodiment, the transcription factor polypeptide of (b) comprises G682, or SEQ ID NO: 468.
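By way of illustration only, the SEQ ID NO numbering formulas recited above can be expanded with a short script. The following is a minimal sketch in Python; the helper name `seq_ids` and the variable names are hypothetical and do not appear in the Sequence Listing. It simply enumerates the ranges as stated (2N-1 and 2N for N=1-480, 2N for N=856-969, and 2N-1 for N=857-970) and checks them against the G682 example.

```python
from typing import Callable, List


def seq_ids(formula: Callable[[int], int], n_first: int, n_last: int) -> List[int]:
    """Expand a SEQ ID NO formula over an inclusive range of N (hypothetical helper)."""
    return [formula(n) for n in range(n_first, n_last + 1)]


# First series, as recited: polynucleotides are 2N-1 and polypeptides are 2N, N=1-480.
polynucleotide_ids = seq_ids(lambda n: 2 * n - 1, 1, 480)
polypeptide_ids = seq_ids(lambda n: 2 * n, 1, 480)

# Second series, as recited: polynucleotides are 2N for N=856-969,
# and polypeptides are 2N-1 for N=857-970.
second_polynucleotide_ids = seq_ids(lambda n: 2 * n, 856, 969)
second_polypeptide_ids = seq_ids(lambda n: 2 * n - 1, 857, 970)

# The G682 example from the text: polynucleotide 467, polypeptide 468.
print(467 in polynucleotide_ids, 468 in polypeptide_ids)
print(second_polynucleotide_ids[0], second_polynucleotide_ids[-1])
print(second_polypeptide_ids[0], second_polypeptide_ids[-1])
```

Under these formulas the first series covers odd identifiers 1 to 959 for polynucleotides and even identifiers 2 to 960 for polypeptides, which is consistent with G682 being recited as SEQ ID NO: 467 and SEQ ID NO: 468.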

The transgenic plant may also comprise a polynucleotide sequence that is a variant of the sequences in (a) and (b) that encode a polypeptide and initiate transcription, including:

-   (c) a sequence variant of the nucleotide sequences of (a) or (b);
-   (d) an allelic variant of the nucleotide sequences of (a) or (b);
-   (e) a splice variant of the nucleotide sequences of (a) or (b);
-   (f) an orthologous sequence of the nucleotide sequences of (a) or (b);
-   (g) a paralogous sequence of the nucleotide sequences of (a) or (b);
-   (h) a nucleotide sequence encoding a polypeptide comprising a conserved domain that exhibits at least 70% sequence homology with the polypeptide of (a), and the polypeptide comprises a conserved domain that initiates transcription; or
-   (i) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide sequence of one or more polynucleotides of (a) or (b), and the nucleotide sequence encodes a polypeptide that initiates transcription.

A transcription factor sequence variant is one having at least 26% amino acid sequence similarity, at least 40% amino acid sequence identity; a preferred transcription factor sequence variant is one having at least 50% amino acid sequence identity; and a more preferred transcription factor sequence variant is one having at least 65% amino acid sequence identity to the transcription factor amino acid sequences SEQ ID NO: 2N, wherein N=1-480, SEQ ID NO: 2N-1, where N=857-970, or SEQ ID NO: 989, 990, 991, 1001, 1002, 1012, 1018, 1021, 1022, 1025, 1027, 1028, 1029, 1034, 1050, 1051, 1072, 1073, 1074, 1075, 1076, 1091, 1092, 1093, 1094, 1095, 1109, 1110, 1111, 1112, 1150, 1165, 1166, 1167, 1168, 1169, 1189, 1190, 1191, 1197, 1198, 1199, 1213, 1214, 1215, 1216, 1226, 1227, 1233, 1239, 1246, 1247, 1258, 1259, 1269, 1307, 1308, 1309, 1310, 1323, 1329, 1330, 1331, 1332, 1338, 1339, 1340, 1361, 1362, 1373, 1374, 1375, 1384, 1389, 1390, 1391, 1396, 1411, 1412, 1413, 1414, 1424, 1435, 1436, 1437, 1448, 1456, 1457, 1458, 1459, 1460, 1472, 1483, 1484, 1500, 1508, 1510, 1511, 1520, 1538, 1539, 1540, 1541, 1542, 1543, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1582, 1583, 1594, 1611, 1612, 1618, 1619, 1620, 1626, 1627, 1635, 1636, 1640, 1641, 1655, 1656, 1657, 1658, 1672, 1673, 1680, 1682, 1686, 1687, 1688, 1689, 1696, 1702, 1703, 1945, 1947, 1949, 1951, 1953, 1955, 1957, 1959, 1961, 1963, 1965, 1967, 1969, 1971, and 1973, and which contains at least one functional or structural characteristic of the transcription factor amino acid sequences. Sequences having lesser degrees of identity but comparable biological activity are considered to be equivalents.
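As a non-limiting illustration of how the identity thresholds above might be applied, the following Python sketch computes percent identity for two sequences that have already been aligned and reports which of the recited identity thresholds (40%, 50%, 65%) is met. The function names, the use of "-" as a gap character, and the choice to exclude gapped columns from the denominator are assumptions made for this example only; identity can be calculated under other conventions, and the alignment step itself is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes both inputs come from the same pairwise alignment, so they have
    equal length, and that gaps are written as '-'. (Illustrative convention.)
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_tier(identity: float) -> str:
    """Map a percent identity onto the thresholds recited in the text."""
    if identity >= 65.0:
        return "more preferred variant (at least 65% identity)"
    if identity >= 50.0:
        return "preferred variant (at least 50% identity)"
    if identity >= 40.0:
        return "variant (at least 40% identity)"
    return "below the recited identity thresholds"


# Toy aligned peptides, invented for illustration; not taken from the Sequence Listing.
identity = percent_identity("MKRGAA", "MKRGTT")
print(round(identity, 1), identity_tier(identity))
```

A result below the recited thresholds does not by itself exclude a sequence, since the text treats sequences with lesser identity but comparable biological activity as equivalents.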

The transcription factor polypeptides of the present invention include at least one conserved domain, and the portions of the nucleotide sequences encoding the conserved domain exhibit at least 70% sequence identity with the aforementioned preferred nucleotide sequences. In the case of zinc finger transcription factors, the percent identity across the conserved domain may be as low as 50%.

The present invention also encompasses MAF transcription factor sequence variants. A MAF transcription factor sequence variant is one having at least 26% amino acid sequence similarity, at least 40% amino acid sequence identity; a preferred MAF transcription factor sequence variant is one having at least 50% amino acid sequence identity; and a more preferred MAF transcription factor sequence variant is one having at least 65% amino acid sequence identity to the MAF transcription factor sequences SEQ ID NO: 568, SEQ ID NO: 944, SEQ ID NO: 946, SEQ ID NO: 948, SEQ ID NO: 1735, SEQ ID NO: 1875, SEQ ID NO: 1971, SEQ ID NO: 1973, SEQ ID NO: 1945, SEQ ID NO: 1947, SEQ ID NO: 1949, SEQ ID NO: 1951, SEQ ID NO: 1953, SEQ ID NO: 1955, SEQ ID NO: 1957, SEQ ID NO: 1959, SEQ ID NO: 1961, SEQ ID NO: 1963, SEQ ID NO: 1965, SEQ ID NO: 1967, SEQ ID NO: 1969, SEQ ID NO: 2010, or SEQ ID NO: 2011, and which contains at least one functional or structural characteristic of the MAF transcription factor amino acid sequences. In a further embodiment, the invention is a polynucleotide encoding a polypeptide having at least 36% amino acid residue identity to a MAF transcription factor selected from the group consisting of SEQ ID NO: 568, SEQ ID NO: 944, SEQ ID NO: 946, SEQ ID NO: 948, SEQ ID NO: 1735, SEQ ID NO: 1875, SEQ ID NO: 1971, SEQ ID NO: 1973, SEQ ID NO: 1945, SEQ ID NO: 1947, SEQ ID NO: 1949, SEQ ID NO: 1951, SEQ ID NO: 1953, SEQ ID NO: 1955, SEQ ID NO: 1957, SEQ ID NO: 1959, SEQ ID NO: 1961, SEQ ID NO: 1963, SEQ ID NO: 1965, SEQ ID NO: 1967, SEQ ID NO: 1969, SEQ ID NO: 2010, or SEQ ID NO: 2011, and having MAF transcription factor activity. In a yet further embodiment the invention is a polynucleotide encoding a polypeptide having a conserved domain of a MAF transcription factor wherein the conserved domain has at least 54% identity to the conserved domain of SEQ ID NO: 568 comprising amino acid residues 2-74 of SEQ ID NO: 568. In a still further embodiment the invention is a polynucleotide encoding a polypeptide having a conserved domain of a MAF transcription factor wherein the conserved domain has at least 64% identity to the conserved domain of SEQ ID NO: 568 comprising amino acid residues 2-57 of SEQ ID NO: 568. Sequences having lesser degrees of identity but comparable biological activity are considered to be equivalents.

In a further aspect, the invention provides a progeny plant derived from a parental plant wherein said progeny plant exhibits, with respect to a specific gene, at least three fold greater messenger RNA (mRNA) levels than said parental plant, wherein the mRNA encodes a DNA-binding protein that is capable of binding to a DNA regulatory sequence and inducing expression of a plant trait gene, wherein the progeny plant is characterized by a change in the plant trait compared to said parental plant. In yet a further aspect, the progeny plant exhibits at least ten fold greater mRNA levels compared to said parental plant. In yet a further aspect, the progeny plant exhibits at least fifty fold greater mRNA levels compared to said parental plant.
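For illustration, the fold-difference comparison described above can be expressed as a simple ratio. The sketch below, in Python, assumes that mRNA levels for the progeny and parental plants have already been measured in the same arbitrary, normalized units; the function names and the example values are hypothetical and are not measurements from this disclosure.

```python
from typing import Optional, Sequence


def fold_change(progeny_mrna: float, parental_mrna: float) -> float:
    """Ratio of progeny to parental mRNA level (same normalized units assumed)."""
    if parental_mrna <= 0:
        raise ValueError("parental mRNA level must be positive")
    return progeny_mrna / parental_mrna


def highest_threshold_met(
    fold: float, thresholds: Sequence[float] = (3.0, 10.0, 50.0)
) -> Optional[float]:
    """Return the largest recited fold threshold (3, 10 or 50) that is met, if any."""
    met = [t for t in thresholds if fold >= t]
    return max(met) if met else None


# Invented example values in arbitrary units.
fold = fold_change(progeny_mrna=120.0, parental_mrna=10.0)
print(fold, highest_threshold_met(fold))
```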

Various types of plants may be used to generate the transgenic plants, including soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, clover, sugarcane, turf, banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, tobacco, tomato, watermelon, mint and other labiates, rosaceous fruits, and vegetable brassicas.

The transgenic plant may be a monocotyledonous plant, and the polynucleotide sequences used to transform the transgenic plant may be derived from either a monocot or a dicot plant. Alternatively, the transgenic plant may be a dicotyledonous plant, and the polynucleotide sequences used to transform the transgenic plant may be derived from either a monocot or a dicot plant.

These transgenic plants will generally possess traits that are altered as compared to a control plant, such as a wild-type or non-transformed plant (i.e., the non-transformed plant does not comprise the recombinant polynucleotide), thus producing a phenotype that is altered when compared to the control, wild-type or non-transformed plant. These transgenic plants may also express an altered level of one or more genes associated with a plant trait as compared to the non-transformed plant. The encoded polypeptides in these transgenic plants will generally be expressed and regulate transcription of at least one gene; this gene will generally confer at least one altered trait, phenotype or expression level.

The polynucleotide sequences (those listed in the Sequence Listing, their complements, or functional variants) used to transform the transgenic plants of the present invention may further comprise regulatory elements, for example, a constitutive, inducible, or tissue-specific promoter operably linked to the polynucleotide sequence.

Transformation of plants with presently disclosed transcription factor sequences will produce a variety of improved traits. For example, the altered trait may be an enhanced tolerance to abiotic stress, such as salt tolerance. Salt tolerance, a form of osmotic stress tolerance, may be mediated by increased root growth or increased root hairs relative to a non-transformed, control or wild-type plant. Tolerance to abiotic stresses such as salt stress may confer a number of survival, quality and yield improvements, including improved seed germination and improved seedling vigor, as well as improved yield, quality, and range.

Another example of an altered trait that may be conferred by transforming plants with the presently disclosed transcription factor sequences includes altered sugar sensing. Altered sugar sensing may also be used to confer improved seed germination and improved seedling vigor, as well as altered flowering, senescence, sugar metabolism and photosynthesis characteristics.

The invention also pertains to methods for producing these transgenic plants.

The present invention also relates to a method of using transgenic plants transformed with the presently disclosed transcription factor sequences, their complements or their variants to grow a progeny plant by crossing the transgenic plant with either itself or another plant, selecting seed that develops as a result of the crossing, and then growing the progeny plant from the seed. The progeny plant will generally express mRNA that encodes a transcription factor: that is, a DNA-binding protein that binds to a DNA regulatory sequence and induces expression, such as that of a plant trait gene. The mRNA will generally be expressed at a level greater than in a non-transformed plant, and the progeny plant is characterized by a change in a plant trait compared to the non-transformed plant.

The present invention also pertains to an expression cassette. The expression cassette comprises at least two elements, including:

-   (1) a constitutive, inducible, or tissue-specific promoter; and
-   (2) a recombinant polynucleotide having a polynucleotide sequence, or a complementary polynucleotide sequence thereof, selected from the group consisting of a nucleotide sequence encoding a polypeptide sequence selected from the transcription factor sequences in the Sequence Listing, for example, polypeptide sequence G682, SEQ ID NO: 468; a nucleotide sequence selected from the transcription factor polynucleotides of the Sequence Listing, for example, polynucleotide sequence G682, SEQ ID NO: 467; or sequence variants such as allelic or splice variants of the nucleotide sequences of (a) or (b), where the sequence variant encodes a polypeptide that initiates transcription. The nucleotide sequence may also comprise an orthologous or paralogous sequence of the nucleotide sequences of (a) or (b), where these sequences encode a polypeptide that initiates transcription; a nucleotide sequence that encodes a polypeptide having a conserved domain that exhibits 72% or greater sequence homology with the polypeptide of (a), where the polypeptide comprising the conserved domain initiates transcription; or a nucleotide sequence that hybridizes under stringent conditions to a nucleotide sequence of one or more polynucleotides of (a) or (b), where the latter nucleotide sequence initiates transcription. In all of these cases, the recombinant polynucleotide is operably linked to the promoter of the expression cassette.

The invention includes a host cell that comprises the expression cassette. The host cell may be a plant cell, such as a cell of a crop plant.

The invention also concerns a method for identifying at least one downstream polynucleotide sequence that is subject to a regulatory effect of any of the polypeptide transcription factors of the present invention, or a sequence variant, ortholog or paralog of any of these sequences. This method is conducted by expressing a polypeptide transcription factor, variant, ortholog or paralog in a plant cell, and then identifying an expression product, such as RNA or protein, produced as a result. The identification method used can be any method that identifies RNA or protein products of expression, such as, for example, Northern analysis, RT-PCR, microarray gene expression assays, reporter gene expression systems, subtractive hybridization, differential display, representational differential analysis, or two-dimensional gel electrophoresis of one or more protein products.

In another aspect the invention is a method of screening a plurality of plants to identify at least one plant that comprises a polynucleotide encoding a MAF transcription factor protein wherein the expression of the polynucleotide alters at least one of the plant's traits. The method comprises (a) selecting a first polynucleotide from the group consisting of a combination of plant polynucleotide sequences of SEQ ID NO: 567, SEQ ID NO: 943, SEQ ID NO: 945, SEQ ID NO: 947, SEQ ID NO: 1734, SEQ ID NO: 1874, SEQ ID NO: 1014, SEQ ID NO: 1970, SEQ ID NO: 1972, SEQ ID NO: 1944, SEQ ID NO: 1946, SEQ ID NO: 1948, SEQ ID NO: 1950, SEQ ID NO: 1952, SEQ ID NO: 1954, SEQ ID NO: 1956, SEQ ID NO: 1958, SEQ ID NO: 1960, SEQ ID NO: 1962, SEQ ID NO: 1964, SEQ ID NO: 1966, or SEQ ID NO: 1968; (b) comparing the first polynucleotide sequence with a second polynucleotide sequence wherein the second polynucleotide sequence is isolated from a second plant; (c) selecting the second polynucleotide sequence, wherein the second polynucleotide sequence encodes a polypeptide sequence that has at least 60% identity with a polypeptide sequence encoded by the sequence of the first polynucleotide; (d) comparing the second polynucleotide sequence with a third polynucleotide sequence wherein the third polynucleotide sequence is isolated from a third plant, wherein the third plant is selected from a plurality of plants; (e) selecting the third polynucleotide sequence, wherein the third polynucleotide sequence encodes a polypeptide sequence that has at least one amino acid substitution compared with a polypeptide sequence encoded by the sequence of the second polynucleotide; (f) identifying the third plant from which the third polynucleotide came; (g) measuring the expression level of the endogenous third polynucleotide sequence in another third plant; (h) identifying which other third plant expresses the third polynucleotide sequence; and (i) identifying a trait in the other third plant of step (h) which is changed when compared with the same trait in the second plant, wherein the trait is selected from the group consisting of at least one trait listed below.

In another embodiment, the method further comprises the third polynucleotide sequence that has at least one nucleotide base substitution compared with the polynucleotide sequence of the second polynucleotide.

BRIEF DESCRIPTION OF THE SEQUENCE LISTING, TABLES, AND DRAWINGS

The Sequence Listing provides exemplary polynucleotide and polypeptide sequences of the invention. The traits associated with the use of the sequences are included in the Examples.

CD-ROM1 (Copy 1) is a read-only memory computer-readable compact disc and contains a copy of the Sequence Listing in ASCII text format and a copy of Table 8 as an MS Word document. The Sequence Listing is named “MB10048.ST25.txt” and is 4,267 kilobytes in size. Table 8 is named “Table 8.doc” and is 3,453 kilobytes in size. The copies of the Sequence Listing and Table 8 on the CD-ROM disc are hereby incorporated by reference in their entirety.

CD-ROM2 (Copy 2) is an exact copy of CD-ROM1 (Copy 1).

CD-ROM3 contains a computer-readable format (CRF) copy of the Sequence Listing as a text (.txt) file.

Table 8 lists a summary of homologous sequences identified using BLAST (tblastx program). The first column shows the polynucleotide sequence identifier (SEQ ID NO:), the second column shows the corresponding cDNA identifier (Gene ID or GID), the third column shows the orthologous or homologous polynucleotide GenBank Accession Number (Test Sequence ID), the fourth column shows the calculated probability value that the sequence identity is due to chance (Smallest Sum Probability), the fifth column shows the plant species from which the test sequence was isolated (Test Sequence Species), and the sixth column shows the orthologous or homologous test sequence GenBank annotation (Test Sequence GenBank Annotation).

Table 14 shows the polypeptide sequence identities and similarities between exemplary polypeptides of the invention. The first column and first row show the polypeptide SEQ ID NO (polypeptide SEQ ID NO); the second column and second row show the Mendel Name (Name); the third column and third row show the Mendel Gene identifier number (Gene ID). The percentage identity, and percentage similarity in parentheses, between the two polypeptide sequences are indicated at the intersection of each column and row.

Table 15 shows flowering times for the maf2 mutant. The first column shows the genotype of the Arabidopsis plant tested. The second column shows the number of plants tested; N=number. The third column shows the number of days the plant was cold-treated. The fourth column shows the photoperiod in hours. The fifth column shows the number of days elapsed from beginning of cold treatment at which time flower buds were visible. The sixth column shows the total leaf primordia produced by primary shoot meristem before first flower. Four separate experiments were done (Experiments 1 through 4).

Table 16 shows flowering times of transgenic plants over-expressing MAF2 (SEQ ID NO: 567), MAF3 (SEQ ID NO: 943), MAF4 (SEQ ID NO: 945), and MAF5 (SEQ ID NO: 947) in Arabidopsis Stockholm accession and Columbia accession transgenic T1 lines. The first column lists the genotype of the transgenic plants (Genotype). The second column shows the observed phenotype in the transgenic plants (Phenotype). The third column shows the penetrance of the observed altered phenotype as a fraction of total transgenic plants (Penetrance). The fourth column shows the number of days to a visible flower bud in all transgenic plants (Days to visible flower bud; range and (mean+/−S.E.M.)). The fifth column shows the total leaf number of all transgenic plants (Total leaf number; range and (mean+/−S.E.M.)).

Table 17 shows flowering times of transgenic plants over-expressing MAF2 (SEQ ID NO: 567), MAF3 (SEQ ID NO: 943), MAF4 (SEQ ID NO: 945), and MAF5 (SEQ ID NO: 947) in Arabidopsis T2 Columbia accession and Landsberg accession transgenic T1 lines. The first column lists the genotype of the transgenic plants (Genotype). The second column shows the T2 line analyzed (Line). The third column shows the total number of plants analyzed (N). The fourth column shows the observed phenotype in the T1 transgenic plants (T1 phenotype). The fifth column shows the observed phenotype in the T2 transgenic plants (T2 phenotype). The sixth column shows the number of days of cold treatment for the T2 lines (Days of cold treatment). The seventh column shows the number of hours the transgenic plants were exposed to light per day (Photoperiod (hours)). The eighth column shows the number of days to a visible flower bud in all transgenic plants (Days to visible flower bud; range and (mean+/−S.E.M.)). The ninth column shows the total leaf number of all transgenic plants (Total leaf number; range and (mean+/−S.E.M.)). NA: not applicable; ND: not determined.

FIG. 1 shows a conservative estimate of phylogenetic relationships among the orders of flowering plants (modified from Angiosperm Phylogeny Group (1998) Ann. Missouri Bot. Gard. 84: 1-49). Those plants with a single cotyledon (monocots) are a monophyletic clade nested within at least two major lineages of dicots; the eudicots are further divided into rosids and asterids. Arabidopsis is a rosid eudicot classified within the order Brassicales; rice is a member of the monocot order Poales. FIG. 1 was adapted from Daly et al. (2001) Plant Physiol. 127: 1328-1333.

FIG. 2 shows a phylogenetic dendrogram depicting phylogenetic relationships of higher plant taxa, including clades containing tomato and Arabidopsis; adapted from Ku et al. (2000) Proc. Natl. Acad. Sci. 97: 9121-9126; and Chase et al. (1993) Ann. Missouri Bot. Gard. 80: 528-580.

FIGS. 3A and 3B show an alignment of G682 (SEQ ID NO: 468) and polypeptide sequences that are paralogous and orthologous to G682.

FIGS. 4A, 4B, 4C and 4D show an alignment of G867 (SEQ ID NO: 580) and polypeptide sequences that are paralogous and orthologous to G867.

FIGS. 5A, 5B, 5C, 5D, 5E and 5F show an alignment of G912 (SEQ ID NO: 616) and polypeptide sequences that are paralogous and orthologous to G912.

FIG. 6 shows an alignment of the polypeptide sequences of FLC (SEQ ID NO: 1875) and MAF1-5 (SEQ ID NOs: 1735, 568, 944, 946, and 948) full-length polypeptides encoded by SEQ ID NOs: 1874, 1734, 567, 943, 945, and 947, respectively. The alignment was produced by manual alignment of the polypeptide sequences.

FIGS. 7A and 7B show an alignment of the polypeptide sequences of SEQ ID NO: 948 (G1844.pep; MAF5), SEQ ID NO: 946 (G1843.pep; MAF4), SEQ ID NO: 944 (G1842.pep; MAF3), SEQ ID NO: 568 (G859.pep; MAF2), SEQ ID NO: 1735 (G157.pep; MAF1), SEQ ID NO: 1875 (G1759.pep; FLC), SEQ ID NO: 1971 (Soy1.pep; SOY MADS1), and SEQ ID NO: 1973 (Soy3.pep; SOY MADS3) full-length polypeptides encoded by SEQ ID NOs: 947, 945, 943, 567, 1734, 1874, 1970, and 1972, respectively. Regions of identity between the polypeptide sequences are boxed. The calculated consensus sequence is shown beneath the alignments. The alignment was made using the CLUSTALW alignment program in the MACVECTOR sequence data package (MACVECTOR 6.0 or MACVECTOR 6.5 applications, Accelrys, San Diego, Calif.).

FIG. 8 shows the effects of vernalization on endogenous expression of MAF2-5 (SEQ ID NOs: 567, 943, 945, and 947) in different genetic backgrounds (accessions). Expression was monitored by RT-PCR (MAF1, 2, 3, 4, and 5, and FLC transcripts). Vernalized (+) samples were cold-treated for 6 weeks at 4° C., whereas non-vernalized (−) samples were stratified for only 3 days at 4° C. as imbibed seeds. Col=Columbia, Pi−0=Pitztal, St−0=Stockholm, fca=fca−9 mutant.

FIG. 9 is a schematic diagram summarizing the responses of FLC and MAF1-5 (SEQ ID NOs: 567, 943, 945, and 947) to vernalization, and their potential effects on the floral transition. Arrows indicate positive interactions; blunt-ended lines denote inhibition.

FIG. 10 shows the effect of vernalization on the maf2 mutant: (A) in the absence of vernalization; (B) following a vernalization treatment; (C) days to visible bud of maf2 mutant compared with wild type; and (D) RT-PCR transcript analysis of endogenous genes of maf2 mutant compared with those in wild type.

FIG. 11 shows the effects of MAF2 overexpression in the Columbia ecotype: (A) shows vernalized or non-vernalized transgenic 35S:MAF2 plants and wild type plants; and (B) shows RT-PCR transcript analysis of endogenous FLC, MAF2, and SOC1 transcripts in wild type (Columbia accession), transgenic 35S:MAF2, and transgenic 35S:FLC seedlings.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

In an important aspect, the present invention relates to polynucleotides and polypeptides, for example, for modifying phenotypes of plants. Throughout this disclosure, various information sources are referred to and/or are specifically incorporated. The information sources include scientific journal articles, patent documents, textbooks, and World Wide Web browser-inactive page addresses, for example. While the reference to these information sources clearly indicates that they can be used by one of skill in the art, each and every one of the information sources cited herein is specifically incorporated in its entirety, whether or not a specific mention of “incorporation by reference” is noted. The contents and teachings of each and every one of the information sources can be relied on and used to make and use embodiments of the invention.

It must be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a plant” includes a plurality of such plants, and a reference to “a stress” is a reference to one or more stresses and equivalents thereof known to those skilled in the art, and so forth.

The polynucleotide sequences of the invention encode polypeptides that are members of well-known transcription factor families, including plant transcription factor families, as disclosed in Tables 4-5. Generally, the transcription factors encoded by the present sequences are involved in cell differentiation and proliferation and the regulation of growth. Accordingly, one skilled in the art would recognize that by expressing the present sequences in a plant, one may change the expression of autologous genes or induce the expression of introduced genes. By affecting the expression of similar autologous sequences in a plant that have the biological activity of the present sequences, or by introducing the present sequences into a plant, one may alter a plant's phenotype to one with improved traits. The sequences of the invention may also be used to transform a plant and introduce desirable traits not found in the wild-type cultivar or strain. Plants may then be selected for those that produce the most desirable degree of over- or under-expression of target genes of interest and coincident trait improvement.

The sequences of the present invention may be from any species, particularly plant species, in a naturally occurring form or from any source whether natural, synthetic, semi-synthetic or recombinant. The sequences of the invention may also include fragments of the present amino acid sequences. In this context, a “fragment” refers to a fragment of a polypeptide sequence which is at least 5 to about 15 amino acids in length, most preferably at least 14 amino acids, and which retains some biological activity of a transcription factor. Where “amino acid sequence” is recited to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms are not meant to limit the amino acid sequence to the complete native amino acid sequence associated with the recited protein molecule.

As one of ordinary skill in the art recognizes, transcription factors can be identified by the presence of a region or domain of structural similarity or identity to a specific consensus sequence or the presence of a specific consensus DNA-binding site or DNA-binding site motif (see, for example, Riechmann et al. (2000) Science 290: 2105-2110). The plant transcription factors may belong to one of the following transcription factor families: the AP2 (APETALA2) domain transcription factor family (Riechmann and Meyerowitz (1998) Biol. Chem. 379: 633-646); the MYB transcription factor family (ENBib; Martin and Paz-Ares (1997) Trends Genet. 13: 67-73); the MADS domain transcription factor family (Riechmann and Meyerowitz (1997) Biol. Chem. 378: 1079-1101; Immink et al. (2003) Mol. Gen. Genomics 268: 598-606); the WRKY protein family (Ishiguro and Nakamura (1994) Mol. Gen. Genet. 244: 563-571); the ankyrin-repeat protein family (Zhang et al. (1992) Plant Cell 4: 1575-1588); the zinc finger protein (Z) family (Klug and Schwabe (1995) FASEB J. 9: 597-604; Takatsuji (1998) Cell. Mol. Life Sci. 54: 582-596); the homeobox (HB) protein family (Buerglin (1994) in Guidebook to the Homeobox Genes, Duboule (ed.) Oxford University Press); the CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev. 3: 1166-1178); the squamosa promoter binding proteins (SPB) (Klein et al. (1996) Mol. Gen. Genet. 250: 7-16); the NAM protein family (Souer et al. (1996) Cell 85: 159-170); the IAA/AUX proteins (Abel et al. (1995) J. Mol. Biol. 251: 533-549); the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1: 639-709); the DNA-binding protein (DBP) family (Tucker et al. (1994) EMBO J. 13: 2994-3002); the bZIP family of transcription factors (Foster et al. (1994) FASEB J. 8: 192-200); the Box P-binding protein (the BPF-1) family (da Costa e Silva et al. (1993) Plant J. 4: 125-135); the high mobility group (HMG) family (Bustin and Reeves (1996) Prog. Nucl. Acids Res. Mol. Biol. 54: 35-100); the scarecrow (SCR) family (Di Laurenzio et al. (1996) Cell 86: 423-433); the GF14 family (Wu et al. (1997) Plant Physiol. 114: 1421-1431); the polycomb (PCOMB) family (Goodrich et al. (1997) Nature 386: 44-51); the teosinte branched (TEO) family (Luo et al. (1996) Nature 383: 794-799); the ABI3 family (Giraudat et al. (1992) Plant Cell 4: 1251-1261); the triple helix (TH) family (Dehesh et al. (1990) Science 250: 1397-1399); the EIL family (Chao et al. (1997) Cell 89: 1133-44); the AT-HOOK family (Reeves and Nissen (1990) J. Biol. Chem. 265: 8573-8582); the S1FA family (Zhou et al. (1995) Nucleic Acids Res. 23: 1165-1169); the bZIPT2 family (Lu and Ferl (1995) Plant Physiol. 109: 723); the YABBY family (Bowman et al. (1999) Development 126: 2387-96); the PAZ family (Bohmert et al. (1998) EMBO J. 17: 170-80); a family of miscellaneous (MISC) transcription factors including the DPBF family (Kim et al. (1997) Plant J. 11: 1237-1251) and the SPF1 family (Ishiguro and Nakamura (1994) Mol. Gen. Genet. 244: 563-571); the GARP family (Hall et al. (1998) Plant Cell 10: 925-936), the TUBBY family (Boggin et al. (1999) Science 286: 2119-2125), the heat shock family (Wu (1995) Annu. Rev. Cell Dev. Biol. 11: 441-469), the ENBP family (Christiansen et al. (1996) Plant Mol. Biol. 32: 809-821), the RING-zinc family (Jensen et al. (1998) FEBS Letters 436: 283-287), the PDBP family (Janik et al. (1989) Virology 168: 320-329), the PCF family (Cubas et al. (1999) Plant J. 18: 215-22), the SRS (SHI-related) family (Fridborg et al. (1999) Plant Cell 11: 1019-1032), the CPP (cysteine-rich polycomb-like) family (Cvitanich et al. (2000) Proc. Natl. Acad. Sci. 97: 8163-8168), the ARF (auxin response factor) family (Ulmasov et al. (1999) Proc. Natl. Acad. Sci. 96: 5844-5849), the SWI/SNF family (Collingwood et al. (1999) J. Mol. Endocrinol. 23: 255-275), the ACBF family (Seguin et al. (1997) Plant Mol. Biol. 35: 281-291), the PCGL (CG-1 like) family (da Costa e Silva et al. (1994) Plant Mol. Biol. 25: 921-924), the ARID family (Vazquez et al. (1999) Development 126: 733-742), the Jumonji family (Balciunas et al. (2000) Trends Biochem. Sci. 25: 274-276), the bZIP-NIN family (Schauser et al. (1999) Nature 402: 191-195), the E2F family (Kaelin et al. (1992) Cell 70: 351-364) and the GRF-like family (Knaap et al. (2000) Plant Physiol. 122: 695-704). As indicated by any part of the list above and as known in the art, transcription factors have been sometimes categorized by class, family, and sub-family according to their structural content and consensus DNA-binding site motif, for example. Many of the classes and many of the families and sub-families are listed here. However, the inclusion of one sub-family and not another, or the inclusion of one family and not another, does not mean that the invention does not encompass polynucleotides or polypeptides of a certain family or sub-family. The list provided here is merely an example of the types of transcription factors and the knowledge available concerning the consensus sequences and consensus DNA-binding site motifs that help define them as known to those of skill in the art (each of the references noted above is specifically incorporated herein by reference). A transcription factor may include, but is not limited to, any polypeptide that can activate or repress transcription of a single gene or a number of genes. This polypeptide group includes, but is not limited to, DNA-binding proteins, DNA-binding protein binding proteins, protein kinases, protein phosphatases, protein methyltransferases, GTP-binding proteins, and receptors, and the like.

In addition to methods for modifying a plant phenotype by employing one or more polynucleotides and polypeptides of the invention described herein, the polynucleotides and polypeptides of the invention have a variety of additional uses. These uses include their use in the recombinant production (i.e., expression) of proteins; as regulators of plant gene expression; as diagnostic probes for the presence of complementary or partially complementary nucleic acids (including for detection of natural coding nucleic acids); as substrates for further reactions, e.g., mutation reactions, PCR reactions, or the like; as substrates for cloning, e.g., including digestion or ligation reactions; and for identifying exogenous or endogenous modulators of the transcription factors. A “polynucleotide” is a nucleic acid sequence comprising a plurality of polymerized nucleotides, e.g., at least about 15 consecutive polymerized nucleotides, optionally at least about 30 consecutive nucleotides, or at least about 50 consecutive nucleotides. In many instances, a polynucleotide comprises a nucleotide sequence encoding a polypeptide (or protein) or a domain or fragment thereof. Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation site, 5′ or 3′ untranslated regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be single stranded or double stranded DNA or RNA. The polynucleotide optionally comprises modified bases or a modified backbone. The polynucleotide can be, e.g., genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can comprise a sequence in either sense or antisense orientations.

Definitions

A “recombinant polynucleotide” is a polynucleotide that is not in its native state, e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or the polynucleotide is in a context other than that in which it is naturally found, e.g., separated from nucleotide sequences with which it typically is in proximity in nature, or adjacent to (or contiguous with) nucleotide sequences with which it typically is not in proximity. For example, the sequence at issue can be cloned into a vector, or otherwise recombined with one or more additional nucleic acids.

An “isolated polynucleotide” is a polynucleotide, whether naturally occurring or recombinant, that is present outside the cell in which it is typically found in nature, whether purified or not. Optionally, an isolated polynucleotide is subject to one or more enrichment or purification procedures, e.g., cell lysis, extraction, centrifugation, precipitation, or the like.

A “polypeptide” is an amino acid sequence comprising a plurality of consecutive polymerized amino acid residues, e.g., at least about 15 consecutive polymerized amino acid residues, optionally at least about 30 consecutive polymerized amino acid residues, or at least about 50 consecutive polymerized amino acid residues. In many instances, a polypeptide comprises a polymerized amino acid residue sequence that is a transcription factor or a domain or portion or fragment thereof. Additionally, the polypeptide may comprise 1) a localization domain, 2) an activation domain, 3) a repression domain, 4) an oligomerization domain, or 5) a DNA-binding domain, or the like. The polypeptide optionally comprises modified amino acid residues, naturally occurring amino acid residues not encoded by a codon, or non-naturally occurring amino acid residues.

A “recombinant polypeptide” is a polypeptide produced by translation of a recombinant polynucleotide. A “synthetic polypeptide” is a polypeptide created by consecutive polymerization of isolated amino acid residues using methods well known in the art. An “isolated polypeptide,” whether a naturally occurring or a recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural state in a wild-type cell, e.g., more than about 5% enriched, more than about 10% enriched, or more than about 20%, or more than about 50%, or more, enriched, i.e., alternatively denoted: 105%, 110%, 120%, 150% or more, enriched relative to wild type standardized at 100%. Such an enrichment is not the result of a natural response of a wild-type plant. Alternatively, or additionally, the isolated polypeptide is separated from other cellular components with which it is typically associated, e.g., by any of the various protein purification methods herein.

“Identity” or “similarity” refers to sequence similarity between two polynucleotide sequences or between two polypeptide sequences, with identity being a more strict comparison. The phrases “percent identity” and “% identity” refer to the percentage of sequence similarity found in a comparison of two or more polynucleotide sequences or two or more polypeptide sequences. “Sequence similarity” refers to the percent similarity in base pair sequence (as determined by any suitable method) between two or more polynucleotide sequences. Two or more sequences can be anywhere from 0-100% similar, or any integer value therebetween. Identity or similarity can be determined by comparing a position in each sequence that may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same nucleotide base or amino acid, then the molecules are identical at that position. A degree of similarity or identity between polynucleotide sequences is a function of the number of identical or matching nucleotides at positions shared by the polynucleotide sequences. A degree of identity of polypeptide sequences is a function of the number of identical amino acids at positions shared by the polypeptide sequences. A degree of homology or similarity of polypeptide sequences is a function of the number of amino acids at positions shared by the polypeptide sequences.
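By way of illustration only, the position-by-position comparison described above can be sketched in Python. The function name percent_identity, the use of "-" as a gap character, and the choice to count only positions at which neither sequence has a gap are assumptions made for this sketch; the disclosure permits any suitable method, and the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: only positions where neither sequence has a gap are
    treated as "positions shared by the sequences".
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns occupied by a residue in both sequences.
    shared = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not shared:
        return 0.0
    # A position is identical when both sequences carry the same residue.
    matches = sum(1 for x, y in shared if x == y)
    return 100.0 * matches / len(shared)


# Hypothetical aligned peptide fragments (not taken from the Sequence Listing).
identity = percent_identity("MGRK-KIE", "MGRRGKIE")
meets_threshold = identity >= 60.0
```

A value computed this way could then be compared against a threshold such as the 60% identity recited elsewhere in this disclosure, as in the last line of the sketch.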

“Alignment” refers to a number of DNA or amino acid sequences aligned by lengthwise comparison so that components in common (i.e., nucleotide bases or amino acid residues) may be readily and graphically identified. The number of components in common is related to the homology or identity between the sequences. Alignments such as those of FIG. 3, 4 or 5 may be used to identify “conserved domains” and relatedness within these domains. An alignment may suitably be determined by means of computer programs known in the art, such as MacVector (1999) (Accelrys, Inc., San Diego, Calif.).

The terms “highly stringent” or “highly stringent condition” refer to conditions that permit hybridization of DNA strands whose sequences are highly complementary, wherein these same conditions exclude hybridization of significantly mismatched DNAs. Polynucleotide sequences capable of hybridizing under stringent conditions with the polynucleotides of the present invention may be, for example, variants of the disclosed polynucleotide sequences, including allelic or splice variants, or sequences that encode orthologs or paralogs of presently disclosed polypeptides. Nucleic acid hybridization methods are disclosed in detail by Kashima et al. (1985) Nature 313: 402-404, and Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (“Sambrook”); and by Haymes et al., “Nucleic Acid Hybridization: A Practical Approach”, IRL Press, Washington, D.C. (1985), which references are incorporated herein by reference.

In general, stringency is determined by the temperature, ionic strength, and concentration of denaturing agents (e.g., formamide) used in a hybridization and washing procedure (for a more detailed description of establishing and determining stringency, see below). The degree to which two nucleic acids hybridize under various conditions of stringency is correlated with the extent of their similarity. Thus, similar nucleic acid sequences from a variety of sources, such as within a plant's genome (as in the case of paralogs) or from another plant (as in the case of orthologs) that may perform similar functions can be isolated on the basis of their ability to hybridize with known transcription factor sequences. Numerous variations are possible in the conditions and means by which nucleic acid hybridization can be performed to isolate transcription factor sequences having similarity to transcription factor sequences known in the art and are not limited to those explicitly disclosed herein. Such an approach may be used to isolate polynucleotide sequences having various degrees of similarity with disclosed transcription factor sequences, such as, for example, transcription factors having 60% identity, or more preferably greater than about 70% identity, most preferably 72% or greater identity with disclosed transcription factors.

The term “equivalog” describes members of a set of homologous proteins that are conserved with respect to function since their last common ancestor. Related proteins are grouped into equivalog families, and otherwise into protein families with other hierarchically defined homology types (definition provided by the Institute for Genomic Research (TIGR) website).

The term “variant”, as used herein, may refer to polynucleotides or polypeptides that differ in sequence from the presently disclosed polynucleotides or polypeptides, respectively, and as set forth below.

With regard to polynucleotide variants, differences between presently disclosed polynucleotides and polynucleotide variants are limited so that the nucleotide sequences of the former and the latter are closely similar overall and, in many regions, identical. Due to the degeneracy of the genetic code, differences between the former and latter nucleotide sequences may be silent (i.e., the amino acids encoded by the polynucleotide are the same, and the variant polynucleotide sequence encodes the same amino acid sequence as the presently disclosed polynucleotide). Variant nucleotide sequences may encode different amino acid sequences, in which case such nucleotide differences will result in amino acid substitutions, additions, deletions, insertions, truncations or fusions with respect to the similar disclosed polynucleotide sequences. These variations result in polynucleotide variants encoding polypeptides that share at least one functional characteristic. The degeneracy of the genetic code also dictates that many different variant polynucleotides can encode identical and/or substantially similar polypeptides in addition to those sequences illustrated in the Sequence Listing.
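The notion of a silent nucleotide difference can be illustrated with a short Python sketch that translates two coding sequences using the standard genetic code and checks whether they encode the same amino acid sequence. The helper names translate and is_silent_variant, and the nine-base example sequences, are hypothetical and are not drawn from the Sequence Listing; the sketch assumes in-frame coding sequences containing only the bases A, C, G and T.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent_variant(reference: str, variant: str) -> bool:
    """True when the nucleotide sequences differ but encode the same protein."""
    return (
        reference.upper() != variant.upper()
        and translate(reference) == translate(variant)
    )


# Hypothetical sequences: GGT->GGC and CGT->CGA change no amino acid,
# whereas GGT->GAT replaces glycine with aspartic acid.
reference = "ATGGGTCGT"
silent = is_silent_variant(reference, "ATGGGCCGA")
non_silent = is_silent_variant(reference, "ATGGATCGT")
```

In this sketch the first variant differs from the reference at two nucleotide positions yet encodes the same tripeptide, while the second variant results in an amino acid substitution.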

Also within the scope of the invention is a variant of a transcription factor nucleic acid listed in the Sequence Listing, that is, one having a sequence that differs from one of the polynucleotide sequences in the Sequence Listing, or a complementary sequence, that encodes a functionally equivalent polypeptide (i.e., a polypeptide having some degree of equivalent or similar biological activity) but differs in sequence from the sequence in the Sequence Listing, due to degeneracy in the genetic code. Included within this definition are polymorphisms that may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding polypeptide, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding polypeptide.

“Allelic variant” or “polynucleotide allelic variant” refers to any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in phenotypic polymorphism within populations. Gene mutations may be “silent” or may encode polypeptides having altered amino acid sequence. “Allelic variant” and “polypeptide allelic variant” may also be used with respect to polypeptides, and in this case the term refers to a polypeptide encoded by an allelic variant of a gene.

“Splice variant” or “polynucleotide splice variant” as used herein refers to alternative forms of RNA transcribed from a gene. Splice variation naturally occurs as a result of alternative sites being spliced within a single transcribed RNA molecule or between separately transcribed RNA molecules, and may result in several different forms of mRNA transcribed from the same gene. Thus, splice variants may encode polypeptides having different amino acid sequences, which may or may not have similar functions in the organism. “Splice variant” or “polypeptide splice variant” may also refer to a polypeptide encoded by a splice variant of a transcribed mRNA.

As used herein, “polynucleotide variants” may also refer to polynucleotide sequences that encode paralogs and orthologs of the presently disclosed polypeptide sequences. “Polypeptide variants” may refer to polypeptide sequences that are paralogs and orthologs of the presently disclosed polypeptide sequences.

Differences between presently disclosed polypeptides and polypeptide variants are limited so that the sequences of the former and the latter are closely similar overall and, in many regions, identical. Presently disclosed polypeptide sequences and similar polypeptide variants may differ in amino acid sequence by one or more substitutions, additions, deletions, fusions and truncations, which may be present in any combination. These differences may produce silent changes and result in a functionally equivalent transcription factor. Thus, it will be readily appreciated by those of skill in the art that any of a variety of polynucleotide sequences is capable of encoding the transcription factors and transcription factor homolog polypeptides of the invention. A polypeptide sequence variant may have “conservative” changes, wherein a substituted amino acid has similar structural or chemical properties. Deliberate amino acid substitutions may thus be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the functional or biological activity of the transcription factor is retained. For example, negatively charged amino acids may include aspartic acid and glutamic acid, positively charged amino acids may include lysine and arginine, and amino acids with uncharged polar head groups having similar hydrophilicity values may include leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine and threonine; and phenylalanine and tyrosine (for more detail on conservative substitutions, see Table 2). More rarely, a variant may have “non-conservative” changes, e.g., replacement of a glycine with a tryptophan. Similar minor variations may also include amino acid deletions or insertions, or both. Related polypeptides may comprise, for example, additions and/or deletions of one or more N-linked or O-linked glycosylation sites, or an addition and/or a deletion of one or more cysteine residues. Guidance in determining which and how many amino acid residues may be substituted, inserted or deleted without abolishing functional or biological activity may be found using computer programs well known in the art, for example, DNASTAR software (see U.S. Pat. No. 5,840,544).
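The residue groupings named above can be expressed as a simple lookup. The following is a minimal sketch only: it encodes just the pairs and triplets listed in this paragraph (Table 2 is not reproduced here), and the function name is_conservative is a hypothetical helper, not part of any disclosed software.

```python
# Conservative-substitution groups named in the paragraph above (one-letter codes).
# Assumption: only the groups listed in the text are encoded; Table 2 may list more.
CONSERVATIVE_GROUPS = [
    {"D", "E"},       # aspartic acid, glutamic acid
    {"K", "R"},       # lysine, arginine
    {"L", "I", "V"},  # leucine, isoleucine, valine
    {"G", "A"},       # glycine, alanine
    {"N", "Q"},       # asparagine, glutamine
    {"S", "T"},       # serine, threonine
    {"F", "Y"},       # phenylalanine, tyrosine
]


def is_conservative(original: str, substitute: str) -> bool:
    """Return True if the two residues differ and share one of the listed groups."""
    a, b = original.upper(), substitute.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("D", "E"))  # aspartic acid replaced by glutamic acid
print(is_conservative("G", "W"))  # glycine replaced by tryptophan
```

The glycine-to-tryptophan case corresponds to the “non-conservative” example given in the text.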

The term “plant” includes whole plants, shoot vegetative organs/structures (e.g., leaves, stems and tubers), roots, flowers and floral organs/structures (e.g., bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, endosperm, and seed coat) and fruit (the mature ovary), plant tissue (e.g., vascular tissue, ground tissue, and the like) and cells (e.g., guard cells, egg cells, and the like), and progeny of same. The class of plants that can be used in the method of the invention is generally as broad as the class of higher and lower plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, bryophytes, and multicellular algae. (See, for example, FIG. 1, adapted from Daly et al. (2001) Plant Physiol. 127: 1328-1333; FIG. 2, adapted from Ku et al. (2000) Proc. Natl. Acad. Sci. 97: 9121-9126; and see also Tudge, in The Variety of Life, Oxford University Press, New York, N.Y. (2000) pp. 547-606).

A “transgenic plant” refers to a plant that contains genetic material not found in a wild-type plant of the same species, variety or cultivar. The genetic material may include a transgene, an insertional mutagenesis event (such as by transposon or T-DNA insertional mutagenesis), an activation tagging sequence, a mutated sequence, a homologous recombination event or a sequence modified by chimeraplasty. Typically, the foreign genetic material has been introduced into the plant by human manipulation, but any method can be used as one of skill in the art recognizes.

A transgenic plant may contain an expression vector or cassette. The expression cassette typically comprises a polypeptide-encoding sequence operably linked to (i.e., under regulatory control of) appropriate inducible or constitutive regulatory sequences that allow for the expression of the polypeptide. The expression cassette can be introduced into a plant by transformation or by breeding after transformation of a parent plant. A plant refers to a whole plant as well as to a plant part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any other plant material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems that mimic biochemical or cellular components or processes in a cell.

“Fragment”, with respect to a polynucleotide, refers to a clone or any part of a polynucleotide molecule that retains a usable, functional characteristic. Useful fragments include oligonucleotides and polynucleotides that may be used in hybridization or amplification technologies or in the regulation of replication, transcription or translation. A “polynucleotide fragment” refers to any subsequence of a polynucleotide, typically of at least about 9 consecutive nucleotides, preferably at least about 30 nucleotides, more preferably at least about 50 nucleotides, of any of the sequences provided herein. Exemplary polynucleotide fragments are the first sixty consecutive nucleotides of the transcription factor polynucleotides listed in the Sequence Listing. Exemplary fragments also include fragments that comprise a region that encodes a conserved domain of a transcription factor.

Fragments may also include subsequences of polypeptides and protein molecules, or a subsequence of the polypeptide. Fragments may have uses in that they may have antigenic potential. In some cases, the fragment or domain is a subsequence of the polypeptide which performs at least one biological function of the intact polypeptide in substantially the same manner, or to a similar extent, as does the intact polypeptide. For example, a polypeptide fragment can comprise a recognizable structural motif or functional domain such as a DNA-binding site or domain that binds to a DNA promoter region, an activation domain, or a domain for protein-protein interactions, and may initiate transcription. Fragments can vary in size from as few as 3 amino acids to the full length of the intact polypeptide, but are preferably at least about 30 amino acids in length and more preferably at least about 60 amino acids in length. Exemplary polypeptide fragments are the first twenty consecutive amino acids of the transcription factor polypeptides listed in the Sequence Listing. Exemplary fragments also include fragments that comprise a conserved domain of a transcription factor, for example, amino acid residues 27-63 of G682 (SEQ ID NO: 468), as noted in Table 5.
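Extracting fragments by residue position, as in the examples above, can be sketched as follows. This is an illustrative sketch only: the helper names fragment and leading_fragment are hypothetical, positions are treated as 1-based and inclusive (the convention used when residue ranges are cited), and the placeholder string is not a sequence from the Sequence Listing.

```python
def fragment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end of a sequence (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("fragment coordinates fall outside the sequence")
    return sequence[start - 1:end]


def leading_fragment(sequence: str, length: int) -> str:
    """Return the first `length` consecutive residues or nucleotides."""
    return fragment(sequence, 1, length)


# Placeholder sequence for illustration only; NOT a sequence from the Sequence Listing.
placeholder = "M" + "ACDEFGHIKLMNPQRSTVWY" * 4  # 81 residues

domain = fragment(placeholder, 27, 63)    # a range cited like "residues 27-63"
first20 = leading_fragment(placeholder, 20)
print(len(domain))   # 63 - 27 + 1 = 37 residues
print(len(first20))  # 20 residues
```

The same helpers apply to polynucleotide strings, for example to take the first sixty consecutive nucleotides of a coding sequence.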

The invention also encompasses production of DNA sequences that encode transcription factors and transcription factor derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic sequence may be inserted into any of the many available expression vectors and cell systems using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce mutations into a sequence encoding transcription factors or any fragment thereof.

A “conserved domain” or “conserved region” as used herein refers to a region in heterologous polynucleotide or polypeptide sequences where there is a relatively high degree of sequence identity between the distinct sequences.

With respect to polynucleotides encoding presently disclosed transcription factors, a conserved region is preferably at least 10 base pairs (bp) in length.

A “conserved domain”, with respect to presently disclosed polypeptides, refers to a domain within a transcription factor family that exhibits a higher degree of sequence homology, such as at least 26% sequence similarity, at least 16% sequence identity, preferably at least 40% sequence identity, preferably at least 65% sequence identity including conservative substitutions, and more preferably at least 80% sequence identity, and even more preferably at least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at least about 90%, or at least about 95%, or at least about 98% amino acid residue sequence identity of a polypeptide of consecutive amino acid residues. A fragment or domain can be referred to as outside a conserved domain, outside a consensus sequence, or outside a consensus DNA-binding site that is known to exist or that exists for a particular transcription factor class, family, or subfamily. In this case, the fragment or domain will not include the exact amino acids of a consensus sequence or consensus DNA-binding site of a transcription factor class, family or sub-family, or the exact amino acids of a particular transcription factor consensus sequence or consensus DNA-binding site. Furthermore, a particular fragment, region, or domain of a polypeptide, or a polynucleotide encoding a polypeptide, can be “outside a conserved domain” if all the amino acids of the fragment, region, or domain fall outside of a defined conserved domain(s) for a polypeptide or protein. Sequences having lesser degrees of identity but comparable biological activity are considered to be equivalents.

As one of ordinary skill in the art recognizes, conserved domains may be identified as regions or domains of identity to a specific consensus sequence (see, for example, Riechmann et al. (2000) supra). Thus, by using alignment methods well known in the art, the conserved domains of the plant transcription factors for each of the following may be determined: the AP2 (APETALA2) domain transcription factor family (Riechmann and Meyerowitz (1998) supra); the MYB transcription factor family (ENBib; Martin and Paz-Ares (1997) supra); the MADS domain transcription factor family (Riechmann and Meyerowitz (1997) supra; Immink et al. (2003) supra); the WRKY protein family (Ishiguro and Nakamura (1994) supra); the ankyrin-repeat protein family (Zhang et al. (1992) supra); the zinc finger protein (Z) family (Klug and Schwabe (1995) supra; Takatsuji (1998) supra); the homeobox (HB) protein family (Buerglin (1994) supra); the CAAT-element binding proteins (Forsburg and Guarente (1989) supra); the squamosa promoter binding proteins (SPB) (Klein et al. (1996) supra); the NAM protein family (Souer et al. (1996) supra); the IAA/AUX proteins (Abel et al. (1995) supra); the HLH/MYC protein family (Littlewood et al. (1994) supra); the DNA-binding protein (DBP) family (Tucker et al. (1994) supra); the bZIP family of transcription factors (Foster et al. (1994) supra); the Box P-binding protein (the BPF-1) family (da Costa e Silva et al. (1993) supra); the high mobility group (HMG) family (Bustin and Reeves (1996) supra); the scarecrow (SCR) family (Di Laurenzio et al. (1996) supra); the GF14 family (Wu et al. (1997) supra); the polycomb (PCOMB) family (Goodrich et al. (1997) supra); the teosinte branched (TEO) family (Luo et al. (1996) supra); the ABI3 family (Giraudat et al. (1992) supra); the triple helix (TH) family (Dehesh et al. (1990) supra); the EIL family (Chao et al. (1997) Cell supra); the AT-HOOK family (Reeves and Nissen (1990) supra); the SIFA family (Zhou et al. (1995) supra); the bZIPT2 family (Lu and Ferl (1995) supra); the YABBY family (Bowman et al. (1999) supra); the PAZ family (Bohmert et al. (1998) supra); a family of miscellaneous (MISC) transcription factors including the DPBF family (Kim et al. (1997) supra) and the SPF1 family (Ishiguro and Nakamura (1994) supra); the GARP family (Hall et al. (1998) supra); the TUBBY family (Boggin et al. (1999) supra); the heat shock family (Wu (1995) supra); the ENBP family (Christiansen et al. (1996) supra); the RING-zinc family (Jensen et al. (1998) supra); the PDBP family (Janik et al. (1989) supra); the PCF family (Cubas et al. (1999) supra); the SRS (SHI-related) family (Fridborg et al. (1999) supra); the CPP (cysteine-rich polycomb-like) family (Cvitanich et al. (2000) supra); the ARF (auxin response factor) family (Ulmasov et al. (1999) supra); the SWI/SNF family (Collingwood et al. (1999) supra); the ACBF family (Seguin et al. (1997) supra); the PCGL (CG-1 like) family (da Costa e Silva et al. (1994) supra); the ARID family (Vazquez et al. (1999) supra); the Jumonji family (Balciunas et al. (2000) supra); the bZIP-NIN family (Schauser et al. (1999) supra); the E2F family (Kaelin et al. (1992) supra); and the GRF-like family (Knaap et al. (2000) supra).

The conserved domains for each of the polypeptides of SEQ ID NO: 2N, wherein N=1-480, are listed in Table 5 as described in Example VII. Also, many of the polypeptides of Table 5 have conserved domains specifically indicated by start and stop sites. A comparison of the regions of the polypeptides in SEQ ID NO: 2N, wherein N=1-480, or of those in Table 5, allows one of skill in the art to identify conserved domain(s) for any of the polypeptides listed or referred to in this disclosure, including those in Tables 4-8.

The conserved domains for each of the polypeptides of SEQ ID NOs: 568, 944, 946, 948, 1735, 1875, 1971, and 1973 are listed in Table 5 as described in Example VII. Also, many of the polypeptides of Table 5 have conserved domains specifically indicated by start and stop sites. A comparison of the regions of the polypeptides in SEQ ID NO: 568, SEQ ID NO: 944, SEQ ID NO: 946, SEQ ID NO: 948, SEQ ID NO: 1735, SEQ ID NO: 1875, SEQ ID NO: 1971, SEQ ID NO: 1973, SEQ ID NO: 1945, SEQ ID NO: 1947, SEQ ID NO: 1949, SEQ ID NO: 1951, SEQ ID NO: 1953, SEQ ID NO: 1955, SEQ ID NO: 1957, SEQ ID NO: 1959, SEQ ID NO: 1961, SEQ ID NO: 1963, SEQ ID NO: 1965, SEQ ID NO: 1967, or SEQ ID NO: 1969, or of those in Table 5, allows one of skill in the art to identify conserved domain(s) for any of the polypeptides listed or referred to in this disclosure, including those in Tables 4-8.

A gene is a functional unit of inheritance, and in physical terms is a particular segment or sequence of nucleotides along a molecule of DNA (or RNA, in the case of RNA viruses) involved in producing a polypeptide chain. The latter may be subjected to subsequent processing such as splicing and folding to obtain a functional protein or polypeptide. A gene may be isolated, partially isolated, or be found within an organism's genome. By way of example, a transcription factor gene encodes a transcription factor polypeptide, which may be functional or require processing to function as an initiator of transcription.

Operationally, genes may be defined by the cis-trans test, a genetic test that determines whether two mutations occur in the same gene and which may be used to determine the limits of the genetically active unit (Rieger et al. (1976) Glossary of Genetics and Cytogenetics: Classical and Molecular, 4th ed., Springer Verlag, Berlin). A gene generally includes regions preceding (“leaders”; upstream) and following (“trailers”; downstream) the coding region. A gene may also include intervening, non-coded sequences, referred to as “introns”, located between individual coding segments, referred to as “exons”. Most genes have an associated promoter region, a regulatory sequence 5′ of the transcription initiation codon (there are some genes that do not have an identifiable promoter). The function of a gene may also be regulated by enhancers, operators, and other regulatory elements.

A “trait” refers to a physiological, morphological, biochemical, or physical characteristic of a plant or particular plant material or cell. In some instances, this characteristic is visible to the human eye, such as seed or plant size, or can be measured by biochemical techniques, such as detecting the protein, starch, or oil content of seed or leaves, or by observation of a metabolic or physiological process, e.g., by measuring uptake of carbon dioxide, or by the observation of the expression level of a gene or genes, e.g., by employing Northern analysis, RT-PCR, microarray gene expression assays, or reporter gene expression systems, or by agricultural observations such as stress tolerance, yield, or pathogen tolerance. Any technique can be used to measure the amount of, comparative level of, or difference in any selected chemical compound or macromolecule in the transgenic plants, however.

“Trait modification” refers to a detectable difference in a characteristic in a plant ectopically expressing a polynucleotide or polypeptide of the present invention relative to a plant not doing so, such as a wild-type plant. In some cases, the trait modification can be evaluated quantitatively. For example, the trait modification can entail at least about a 2% increase or decrease in an observed trait (difference), at least a 5% difference, at least about a 10% difference, at least about a 20% difference, at least about a 30%, at least about a 50%, at least about a 70%, or at least about a 100%, or an even greater difference compared with a wild-type plant. It is known that there can be a natural variation in the modified trait. Therefore, the trait modification observed entails a change of the normal distribution of the trait in the plants compared with the distribution observed in wild-type plants.

“Wild type”, as used herein, refers to a cell, tissue or plant that has not been genetically modified to knock out or overexpress one or more of the presently disclosed transcription factors. Wild-type cells, tissue or plants may be used as controls to compare levels of expression and the extent and nature of trait modification with cells, tissue or plants in which transcription factor expression is altered or ectopically expressed, e.g., in that it has been knocked out or overexpressed.

The term “transcript profile” refers to the expression levels of a set of genes in a cell in a particular state, particularly by comparison with the expression levels of that same set of genes in a cell of the same type in a reference state. For example, the transcript profile of a particular transcription factor in a suspension cell is the expression levels of a set of genes in a cell overexpressing that transcription factor compared with the expression levels of that same set of genes in a suspension cell that has normal levels of that transcription factor. The transcript profile can be presented as a list of those genes whose expression level is significantly different between the two treatments, and the difference ratios. Differences and similarities between expression levels may also be evaluated and calculated using statistical and clustering methods.
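A transcript profile presented as a list of genes with their difference ratios can be sketched as below. This is a simplified illustration under stated assumptions: a fixed fold-change cutoff (min_fold, a hypothetical parameter) stands in for a test of statistical significance, which the text does not specify, and the gene names and expression values are invented for the example.

```python
def transcript_profile(treated: dict, reference: dict, min_fold: float = 2.0) -> dict:
    """Map each gene to its treated/reference expression ratio, keeping only
    genes that change by at least `min_fold` in either direction."""
    profile = {}
    for gene, ref_level in reference.items():
        level = treated.get(gene)
        if level is None or level <= 0 or ref_level <= 0:
            continue  # skip genes without a usable measurement in both states
        ratio = level / ref_level
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            profile[gene] = ratio
    return profile


# Invented expression levels (arbitrary units) for illustration only.
overexpressor = {"geneA": 40.0, "geneB": 10.0, "geneC": 2.0}
control = {"geneA": 10.0, "geneB": 9.0, "geneC": 8.0}

for gene, ratio in transcript_profile(overexpressor, control).items():
    print(gene, ratio)
```

In practice the cutoff would be replaced by the statistical or clustering method chosen for the experiment.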

“Ectopic expression” or “altered expression” in reference to a polynucleotide indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is different from the expression pattern in a wild-type plant or a reference plant of the same species. The pattern of expression may also be compared with a reference expression pattern in a wild-type plant of the same species. For example, the polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or tissue type in which the sequence is expressed in the wild-type plant, or by expression at a time other than at the time the sequence is expressed in the wild-type plant, or by a response to different inducible agents, such as hormones or environmental signals, or at different expression levels (either higher or lower) compared with those found in a wild-type plant. The term also refers to altered expression patterns that are produced by lowering the levels of expression to below the detection level or completely abolishing expression. The resulting expression pattern can be transient or stable, constitutive or inducible. In reference to a polypeptide, the term “ectopic expression or altered expression” further may relate to altered activity levels resulting from the interactions of the polypeptides with exogenous or endogenous modulators or from interactions with factors or as a result of the chemical modification of the polypeptides.

The term “overexpression” as used herein refers to a greater expression level of a gene in a plant, plant cell or plant tissue, compared to expression in a wild-type plant, cell or tissue, at any developmental or temporal stage for the gene. Overexpression can occur when, for example, the genes encoding one or more transcription factors are under the control of a strong expression signal, such as one of the promoters described herein (e.g., the cauliflower mosaic virus 35S transcription initiation region). Overexpression may occur throughout a plant or in specific tissues of the plant, depending on the promoter used, as described below.

Overexpression may take place in plant cells normally lacking expression of polypeptides functionally equivalent or identical to the present transcription factors. Overexpression may also occur in plant cells where endogenous expression of the present transcription factors or functionally equivalent molecules normally occurs, but such normal expression is at a lower level. Overexpression thus results in a greater than normal production, or “overproduction”, of the transcription factor in the plant, cell or tissue.

The term “phase change” refers to a plant's progression from embryo to adult, and, by some definitions, the transition wherein flowering plants gain reproductive competency. It is believed that phase change occurs either after a certain number of cell divisions in the shoot apex of a developing plant, or when the shoot apex achieves a particular distance from the roots. Thus, altering the timing of phase changes may affect a plant's size, which, in turn, may affect yield and biomass.

Traits That May Be Modified in Overexpressing or Knock-Out Plants

Trait modifications of particular interest include those to seed (such as embryo or endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: enhanced tolerance to environmental conditions including freezing, chilling, heat, drought, water saturation, radiation and ozone; improved tolerance to microbial, fungal or viral diseases; improved tolerance to pest infestations, including insects, nematodes, mollicutes, parasitic higher plants or the like; decreased herbicide sensitivity; improved tolerance of heavy metals or enhanced ability to take up heavy metals; improved growth under poor photoconditions (e.g., low light and/or short day length), or changes in expression levels of genes of interest. Other phenotypes that can be modified relate to the production of plant metabolites, such as variations in the production of taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax monomers, anti-oxidants, amino acids, lignins, cellulose, tannins, prenyllipids (such as chlorophylls and carotenoids), glucosinolates, and terpenoids, enhanced or compositionally altered protein or oil production (especially in seeds), or modified sugar (insoluble or soluble) and/or starch composition. Physical plant characteristics that can be modified include cell development (such as the number of trichomes), fruit and seed size and number, yields of plant parts such as stems, leaves, inflorescences, and roots, the stability of the seeds during storage, characteristics of the seed pod (e.g., susceptibility to shattering), root hair length and quantity, internode distances, or the quality of seed coat. Plant growth characteristics that can be modified include growth rate, germination rate of seeds, vigor of plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering time, flower abscission, rate of nitrogen uptake, osmotic sensitivity to soluble sugar concentrations, biomass or transpiration characteristics, as well as plant architecture characteristics such as apical dominance, branching patterns, number of organs, organ identity, organ shape or size.

Transcription Factors Modify Expression of Endogenous Genes

Expression of genes that encode transcription factors that modify expression of endogenous genes, polynucleotides, and proteins is well known in the art. In addition, transgenic plants comprising isolated polynucleotides encoding transcription factors may also modify expression of endogenous genes, polynucleotides, and proteins. Examples include Peng et al. (1997) Genes and Development 11: 3194-3205, and Peng et al. (1999) Nature 400: 256-261. In addition, many others have demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant species elicits the same or very similar phenotypic response. See, for example, Fu et al. (2001) Plant Cell 13: 1791-1802; Nandi et al. (2000) Curr. Biol. 10: 215-218; Coupland (1995) Nature 377: 482-483; and Weigel and Nilsson (1995) Nature 377: 482-500.

In another example, Mandel et al. (1992) Cell 71: 133-143 and Suzuki et al. (2001) Plant J. 28: 409-418 teach that a transcription factor expressed in another plant species elicits the same or very similar phenotypic response of the endogenous sequence, as often predicted in earlier studies of Arabidopsis transcription factors in Arabidopsis (see Mandel et al. (1992) supra; Suzuki et al. (2001) supra).

Other examples include Muller et al. (2001) Plant J. 28: 169-179; Kim et al. (2001) Plant J. 25: 247-259; Kyozuka and Shimamoto (2002) Plant Cell Physiol. 43: 130-135; Boss and Thomas (2002) Nature 416: 847-850; He et al. (2000) Transgenic Res. 9: 223-227; and Robson et al. (2001) Plant J. 28: 619-631.

In yet another example, Gilmour et al. (1998) Plant J. 16: 433-442 teach an Arabidopsis AP2 transcription factor, CBF1 (SEQ ID NO: 44), which, when overexpressed in transgenic plants, increases plant freezing tolerance. Jaglo et al. (2001) Plant Physiol. 127: 910-917 further identified sequences in Brassica napus which encode CBF-like genes and found that transcripts for these genes accumulated rapidly in response to low temperature. Transcripts encoding CBF-like proteins were also found to accumulate rapidly in response to low temperature in wheat, as well as in tomato. An alignment of the CBF proteins from Arabidopsis, B. napus, wheat, rye, and tomato revealed the presence of conserved consecutive amino acid residues, PKK/RPAGRxKFxETRHP and DSAWR, that bracket the AP2/EREBP DNA binding domains of the proteins and distinguish them from other members of the AP2/EREBP protein family. (See Jaglo et al. supra).
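The two bracketing signatures lend themselves to a pattern search. The sketch below is illustrative only: it reads “K/R” as either lysine or arginine at that position and “x” as any residue, which is the usual reading of such notation but is an assumption here; the function name bracketed_region is hypothetical, and the test string is synthetic rather than a real CBF sequence.

```python
import re

# Assumption: "PKK/RPAGRxKFxETRHP" means K or R at the third position, any residue at "x".
UPSTREAM_SIGNATURE = re.compile(r"PK[KR]PAGR.KF.ETRHP")
DOWNSTREAM_SIGNATURE = re.compile(r"DSAWR")


def bracketed_region(protein: str):
    """Return the residues lying between the two signatures, or None if the
    upstream signature or a later downstream signature is not found."""
    upstream = UPSTREAM_SIGNATURE.search(protein)
    if upstream is None:
        return None
    downstream = DOWNSTREAM_SIGNATURE.search(protein, upstream.end())
    if downstream is None:
        return None
    return protein[upstream.end():downstream.start()]


# Synthetic string for illustration only; not a real CBF protein sequence.
synthetic = "MNS" + "PKRPAGRKKFRETRHP" + "AAAAGGGG" + "DSAWR" + "LLL"
print(bracketed_region(synthetic))
```

A sequence lacking either signature returns None, which is how such a filter would separate CBF-like candidates from other AP2/EREBP family members.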

Gao et al. ((2002) Plant Molec. Biol. 49: 459-471) have recently described four CBF transcription factors from Brassica napus: BNCBFs 5, 7, 16 and 17. They note that the first three CBFs (GenBank Accession Numbers AAM18958, AAM18959, and AAM18960, respectively) are very similar to Arabidopsis CBF1, whereas BNCBF17 (GenBank Accession Number AAM18961) is similar but contains two extra regions of 16 and 21 amino acids in its acidic activation domain. All four B. napus CBFs accumulate in leaves of the plants after cold treatment, and BNCBFs 5, 7, and 16 accumulated after salt stress treatment. The authors concluded that these BNCBFs likely function in low-temperature responses in B. napus.

In a functional study of CBF genes, Hsieh et al. ((2002) Plant Physiol. 129: 1086-1094) found that heterologous expression of Arabidopsis CBF1 in tomato plants confers increased tolerance to chilling and considerable tolerance to oxidative stress, which suggested to the authors that ectopic Arabidopsis CBF1 expression may induce several tomato stress responsive genes to protect the plants.

Polypeptides and Polynucleotides of the Invention

The present invention provides, among other things, transcription factors (TFs), and transcription factor homolog polypeptides, and isolated or recombinant polynucleotides encoding the polypeptides, or novel sequence variant polypeptides or polynucleotides encoding novel variants of transcription factors derived from the specific sequences provided here. These polypeptides and polynucleotides may be employed to modify a plant's characteristics.

Exemplary polynucleotides encoding the polypeptides of the invention were identified in the Arabidopsis thaliana GenBank database using publicly available sequence analysis programs and parameters. Sequences initially identified were then further characterized to identify sequences comprising specified sequence strings corresponding to sequence motifs present in families of known transcription factors. In addition, further exemplary polynucleotides encoding the polypeptides of the invention were identified in the plant GenBank database using publicly available sequence analysis programs and parameters. Sequences initially identified were then further characterized to identify sequences comprising specified sequence strings corresponding to sequence motifs present in families of known transcription factors. Polynucleotide sequences meeting such criteria were confirmed as transcription factors.

Additional polynucleotides of the invention were identified by screening Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to known transcription factors under low stringency hybridization conditions. Additional sequences, including full-length coding sequences, were subsequently recovered by the rapid amplification of cDNA ends (RACE) procedure, using a commercially available kit according to the manufacturer's instructions. Where necessary, multiple rounds of RACE are performed to isolate 5′ and 3′ ends. The full-length cDNA was then recovered by a routine end-to-end polymerase chain reaction (PCR) using primers specific to the isolated 5′ and 3′ ends. Exemplary sequences are provided in the Sequence Listing.

The polynucleotides of the invention can be or were ectopically expressed in overexpressor or knockout plants and the changes in the characteristic(s) or trait(s) of the plants observed. Therefore, the polynucleotides and polypeptides can be employed to improve the characteristics of plants.

The polynucleotides of the invention can be or were ectopically expressed in overexpressor plant cells and the changes in the expression levels of a number of genes, polynucleotides, and/or proteins of the plant cells observed. Therefore, the polynucleotides and polypeptides can be employed to change expression levels of genes, polynucleotides, and/or proteins of plants.

MADS Affecting Flowering (MAF) Transcription Factor Gene Family Polynucleotide and Polypeptide Sequences

Examination of the Arabidopsis genome sequence has revealed the existence of five MADS box genes which encode proteins that are highly related to FLC (Alvarez-Buylla et al. (2000a) Proc. Natl. Acad. Sci. 97: 5328-5333; Ratcliffe et al. (2001) Plant Physiol. 126: 122-132). The first of the genes to be analyzed, MADS AFFECTING FLOWERING1, MAF1 (which has also been referred to as FLOWERING LOCUS M, FLM, Scortecci et al. (2001) Plant J. 26: 229-236, and as AGL27, Alvarez-Buylla et al. (2000a) supra), was shown to be a floral repressor (Ratcliffe et al. (2001) supra; Scortecci et al. (2001) supra). MAF1 expression shows a less clear-cut association with the vernalization response than that of FLC, and the gene potentially acts downstream or independently of FLC transcription (Ratcliffe et al. (2001) supra). The functions of four FLC-related genes were analyzed and it was demonstrated that they influence the timing of flowering. In particular, a mechanism that prevents Arabidopsis plants becoming vernalized by short periods of cold was revealed.

The Arabidopsis genome contains four genes, which are highly related to FLC and MAF1, arranged in a tight cluster at the bottom of chromosome 5 (Ratcliffe et al. (2001) supra). The gene cluster occupies approximately 22 kb and comprises At5g65050 (which corresponds to AGL31; Alvarez-Buylla et al. (2000a) supra), At5g65060, At5g65070, and At5g65080. As shown in FIG. 6, an alignment of the full-length protein sequences of SEQ ID NO: 1875 (FLC), SEQ ID NO: 1735 (MAF1), SEQ ID NO: 568 (MAF2), SEQ ID NO: 944 (MAF3), SEQ ID NO: 946 (MAF4), and SEQ ID NO: 948 (MAF5), shows that the protein sequences share a large degree of identity (residue identity denoted by asterisk: “*”; residue similarity denoted by period: “.”) across the entire sequence, depending upon the pair-wise combination. This similarity suggested that the polynucleotide sequences and the encoded polypeptide sequences of MAF2, MAF3, MAF4, and MAF5 are MADS-domain family transcription factors which have functions related to those of MAF1 and FLC in the regulation of flowering time (Riechmann and Meyerowitz (1997) supra).

MAF polypeptide sequences from Arabidopsis and soy were aligned using the CLUSTAL W (1.4) multiple sequence alignment algorithm (MACVECTOR) to create pairwise alignments. The results are shown in Table 14.

As shown in Table 14, SEQ ID NO: 568 (MAF2; G859) has 61% identity and 74% similarity to SEQ ID NO: 1875 (FLC: G1759), 76% identity and 85% similarity to SEQ ID NO: 1735 (MAF1: G157), 87% identity and 90% similarity to SEQ ID NO: 944 (MAF3: G1842), 64% identity and 77% similarity to SEQ ID NO: 946 (MAF4: G1843), 63% identity and 79% similarity to SEQ ID NO: 948 (MAF5: G1844), 34% identity and 52% similarity to SEQ ID NO: 1971 (SOY1: SOY MADS1), and 34% identity and 53% similarity to SEQ ID NO: 1973 (SOY3: SOY MADS3), respectively.

In addition, Table 14 shows that SEQ ID NO: 944 (MAF3; G1842) has 60% identity and 75% similarity to SEQ ID NO: 1875 (FLC: G1759), 78% identity and 87% similarity to SEQ ID NO: 1735 (MAF1: G157), 87% identity and 90% similarity to SEQ ID NO: 568 (MAF2: G859), 64% identity and 78% similarity to SEQ ID NO: 946 (MAF4: G1843), 65% identity and 81% similarity to SEQ ID NO: 948 (MAF5: G1844), 32% identity and 43% similarity to SEQ ID NO: 1971 (SOY1: SOY MADS1), and 34% identity and 56% similarity to SEQ ID NO: 1973 (SOY3: SOY MADS3), respectively.

In addition, Table 14 shows that SEQ ID NO: 946 (MAF4; G1843) has 57% identity and 74% similarity to SEQ ID NO: 1875 (FLC: G1759), 65% identity and 81% similarity to SEQ ID NO: 1735 (MAF1: G157), 64% identity and 77% similarity to SEQ ID NO: 568 (MAF2: G859), 64% identity and 78% similarity to SEQ ID NO: 944 (MAF3: G1842), 71% identity and 83% similarity to SEQ ID NO: 948 (MAF5: G1844), 35% identity and 53% similarity to SEQ ID NO: 1971 (SOY1: SOY MADS1), and 36% identity and 55% similarity to SEQ ID NO: 1973 (SOY3: SOY MADS3), respectively.

In addition, Table 14 shows that SEQ ID NO: 948 (MAF5; G1844) has 53% identity and 73% similarity to SEQ ID NO: 1875 (FLC: G1759), 63% identity and 80% similarity to SEQ ID NO: 1735 (MAF1: G157), 63% identity and 79% similarity to SEQ ID NO: 568 (MAF2: G859), 65% identity and 81% similarity to SEQ ID NO: 944 (MAF3: G1842), 71% identity and 83% similarity to SEQ ID NO: 946 (MAF4: G1843), 36% identity and 54% similarity to SEQ ID NO: 1971 (SOY1: SOY MADS1), and 36% identity and 56% similarity to SEQ ID NO: 1973 (SOY3: SOY MADS3), respectively.

The results show that a polypeptide of 186 amino acid residues (the length of the shortest of the MAF sequences; SOY1: SOY MADS1; SEQ ID NO: 1971) shows at least 32% identity over the complete polypeptide sequence compared with SEQ ID NO: 944 (MAF3: G1842).

FIGS. 7A and 7B show the alignment between the Arabidopsis and soy MAF polypeptide sequences produced using the CLUSTAL W (1.4) multiple sequence alignment algorithm.

As shown in FIGS. 7A and 7B, a conserved domain of SEQ ID NO: 568 (MAF2, G859.pep; amino acid residues Gly2 through Ser57) has amino acid sequence identity with a conserved domain of SEQ ID NO: 1875 (FLC; G1759.pep; 83.9%), SEQ ID NO: 1735 (MAF1, G157.pep; 87.5%), SEQ ID NO: 944 (MAF3, G1842.pep; 92.9%), SEQ ID NO: 946 (MAF4, G1843.pep; 76.8%), SEQ ID NO: 948 (MAF5, G1844.pep; 78.6%), SEQ ID NO: 1971 (SOY MADS1, Soy1.pep; 69.6%), and SEQ ID NO: 1973 (SOY MADS3, Soy3.pep; 66.1%). SOY MADS1 (SEQ ID NO: 1971) and SOY MADS3 (SEQ ID NO: 1973) are therefore soy (Glycine max) MAF homologs of the Arabidopsis SEQ ID NOs: 568, 944, 946, 948, 1735, and 1875.

The conserved domain of MAF transcription factors exemplified by amino acid residues Gly2 through Ser57 of SEQ ID NO: 568 as shown in FIG. 7A is defined by the consensus amino acid residue sequence GX1X1X1X2EIKRIENKSXRQX2TFXKRRXGLXXKARX3LSX2LCXXXX2AX2XX2XSXX4GX1LYXX, wherein X1 represents a basic residue such as K or R, X2 represents an aliphatic residue such as V, I, L, A, or G, X3 represents an acid or amide residue such as E or Q, or D or N, X4 represents an aliphatic hydroxyl residue such as T or S, and wherein X represents any amino acid residue. An exemplary consensus sequence is SEQ ID NO: 2010.

As also shown in FIG. 7A, an additional conserved domain (amino acid residues Ala58 through Gln74) flanks a region C-terminal to the conserved domain of amino acid residues Gly2 through Ser57 of SEQ ID NO: 568. This conserved domain of SEQ ID NO: 568 (MAF2, G859.pep; amino acid residues Ala58 through Gln74) has amino acid sequence identity with a conserved domain of SEQ ID NO: 1875 (FLC; G1759.pep; 58.8%), SEQ ID NO: 1735 (MAF1, G157.pep; 76.5%), SEQ ID NO: 944 (MAF3, G1842.pep; 100%), SEQ ID NO: 946 (MAF4, G1843.pep; 59.2%), SEQ ID NO: 948 (MAF5, G1844.pep; 58.8%), SEQ ID NO: 1971 (SOY MADS1, Soy1.pep; 23.5%), and SEQ ID NO: 1973 (SOY MADS3, Soy3.pep; 23.5%).

As also shown in FIG. 7A, the larger conserved domain of SEQ ID NO: 568 (MAF2, G859.pep; amino acid residues Gly2 through Gln74) has amino acid sequence identity with a conserved domain of SEQ ID NO: 1875 (FLC; G1759.pep; 78.1%), SEQ ID NO: 1735 (MAF1, G157.pep; 84.9%), SEQ ID NO: 944 (MAF3, G1842.pep; 93.2%), SEQ ID NO: 946 (MAF4, G1843.pep; 72.6%), SEQ ID NO: 948 (MAF5, G1844.pep; 72.6%), SEQ ID NO: 1971 (SOY MADS1, Soy1.pep; 65.8%), and SEQ ID NO: 1973 (SOY MADS3, Soy3.pep; 54.8%).

The larger conserved domain shown in FIG. 7A is defined by the consensus amino acid residue sequence GX1X1X1X2EIKRIENKSXRQX2TFXKRRXGLXXKARX3LSX2LCXXXX2AX2XX2XSXX4GX1LYXXXXGDXXXXX2X2XXX5XXXX, wherein X1 represents a basic residue such as K or R, X2 represents an aliphatic residue such as V, I, L, A, or G, X3 represents an acid or amide residue such as E or Q, or D or N, X4 represents an aliphatic hydroxyl residue such as T or S, X5 represents an aromatic residue such as F or Y, and wherein X represents any amino acid residue. An exemplary consensus sequence is SEQ ID NO: 2011.
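By way of illustration only, the two consensus sequences given above can be turned into a search pattern. The following is a minimal Python sketch, not part of the disclosed methods. It assumes that each residue class is limited to the residues named as examples in the text (the text says "such as", so the real classes may be broader), and the names RESIDUE_CLASSES, tokenize and consensus_to_regex are hypothetical helpers introduced here.

```python
import re

# Residue classes as named in the text. Restricting each class to the
# listed example residues is an assumption made for this sketch.
RESIDUE_CLASSES = {
    "X1": "[KR]",     # basic residue
    "X2": "[VILAG]",  # aliphatic residue
    "X3": "[EQDN]",   # acid or amide residue
    "X4": "[TS]",     # aliphatic hydroxyl residue
    "X5": "[FY]",     # aromatic residue
    "X": "[A-Z]",     # any amino acid residue
}

# Consensus for residues Gly2 through Ser57 of SEQ ID NO: 568.
SHORT_CONSENSUS = (
    "GX1X1X1X2EIKRIENKSXRQX2TFXKRRXGLXXKARX3LSX2LCXXXX2AX2XX2XSXX4GX1LYXX"
)
# Consensus for the larger domain, Gly2 through Gln74.
LONG_CONSENSUS = SHORT_CONSENSUS + "XXGDXXXXX2X2XXX5XXXX"


def tokenize(consensus):
    """Split a flat consensus string into one token per residue position.

    A digit always binds to the X immediately before it, so "X2" is one
    token and a bare "X" or a literal residue letter is another.
    """
    return re.findall(r"X[1-5]?|[A-Z]", consensus)


def consensus_to_regex(consensus):
    """Compile a consensus string into a regular expression."""
    parts = [RESIDUE_CLASSES.get(token, token) for token in tokenize(consensus)]
    return re.compile("".join(parts))


# Build a synthetic sequence that satisfies the short consensus by taking
# the first residue of each class; this is not a real MAF sequence.
synthetic = "".join(
    RESIDUE_CLASSES[token][1] if token in RESIDUE_CLASSES else token
    for token in tokenize(SHORT_CONSENSUS)
)
short_pattern = consensus_to_regex(SHORT_CONSENSUS)
print(len(tokenize(SHORT_CONSENSUS)), len(tokenize(LONG_CONSENSUS)))
print(short_pattern.fullmatch(synthetic) is not None)
```

The token counts correspond to the 56 positions of Gly2 through Ser57 and the 73 positions of Gly2 through Gln74. A compiled pattern can be applied to a candidate polypeptide with `search` to locate a matching stretch.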

In addition, FIG. 7 shows that FLC, MAF1, MAF2, MAF3, MAF4, MAF5, SOY MADS1, and SOY MADS3 share three potential protein kinase C phosphorylation sites at amino acid residues Ser15, Ser22, and Ser51; two potential protein kinase A phosphorylation sites at amino acid residues Thr20 and Ser36; and a potential CaMPKII phosphorylation site at amino acid residue Ser36. FLC, MAF1, MAF2, MAF3, MAF4, and MAF5 share three potential casein kinase I phosphorylation sites at amino acid residues Y55, S116, and S129; and MAF1, MAF2, MAF3, and MAF5 share three potential casein kinase II phosphorylation sites at amino acid residues S57, S59, and S19.

Allelic variants of MAF2-5 are represented by SEQ ID NOs: 567, 1944, 1946, 1948, and 1950 (MAF2 variants); SEQ ID NOs: 943, 1952, 1954, 1956, and 1958 (MAF3 variants); SEQ ID NOs: 945, 1960, 1962, 1964, and 1966 (MAF4 variants); and SEQ ID NOs: 947 and 1968 (MAF5 variants).

FIG. 8 shows the effects of vernalization on expression of MAF2-5 (SEQ ID NOs: 567, 943, 945, and 947) in different genetic backgrounds (Arabidopsis accessions). FIG. 9 shows a schematic diagram summarizing the responses of FLC and MAF1-5 to vernalization, and their potential effects on the floral transition. Arrows indicate positive interactions; blunt-ended lines denote inhibition.

MAF2-5 (SEQ ID NOs: 567, 943, 945, and 947; encoding SEQ ID NOs: 568, 944, 946, and 948, respectively) are involved in regulation of the vernalization response. MAF2 (SEQ ID NO: 567) encodes a floral repressor (SEQ ID NO: 568), which participates in a previously unrecognized mechanism that prevents plants from being vernalized by short cold periods. MAF3 and MAF4 (SEQ ID NO: 943 and SEQ ID NO: 945, encoding SEQ ID NOs: 944 and 946, respectively) may have parallel roles to FLC in the maintenance of a vernalization requirement. MAF5 (SEQ ID NO: 947; encoding SEQ ID NO: 948) is activated by vernalization, and could therefore have an opposing role to FLC and MAF1-4.

Therefore, it was concluded that MAF2 (SEQ ID NO: 567; encoding SEQ ID NO: 568) compensates for the decrease in FLC levels that occurs following short cold spells, and thereby prevents a flowering response from being triggered. In some environments, promotion of flowering in response to a few days of cold weather might be advantageous. However, winter annual strains of Arabidopsis from northern latitudes have evolved to over-winter vegetatively and commence flowering in the spring only after a sustained period of low temperature (Reeves and Coupland (2000) supra; Michaels and Amasino (2000) supra). Individual plants without MAF2-like activity would be more susceptible to transient cold spells in the autumn, when conditions for seed set are unfavorable. Thus, there would likely be a selective advantage for a plant to evolve MAF2 function. It would be advantageous for a plant that does not have an endogenous MAF2-like activity to be bred and/or transformed to have MAF2-like (MAF) activity in order to be less susceptible to transient cold.

The polynucleotide sequences of SEQ ID NOs: 1970 and 1972 may be altered such that amino acid sequences SEQ ID NOs: 1971 and 1973 (SOY MADS1 and SOY MADS3, respectively) have conservative and non-conservative similar amino acid substitutions to create sequences which can have MAF (such as MAF2-like) activity.

In general, a wide variety of applications exist for systems that either lengthen or shorten the time to flowering.

Accelerated Flowering:

Most modern crop varieties are the result of extensive breeding programs. Many generations of backcrossing can be required to introduce desired traits. Transgenic plants comprising systems that accelerate flowering could have valuable applications in such programs. A faster generation time can allow additional harvests of a crop to be made within a given growing season. With the advent of transformation systems for tree species such as oil palm and Eucalyptus, forest biotechnology is a growing area of interest. Acceleration of flowering, again, can reduce generation times and make breeding programs feasible which would otherwise be impossible. That this is a real possibility has already been demonstrated in aspen, a tree species that usually takes 8-20 years to flower. Transgenic aspen that over-express the Arabidopsis LFY gene flower after only 5 months. The flowers produced by these young aspen plants, however, were sterile (Weigel and Nilsson (1995) Nature 377: 495-500).

Delayed Flowering:

In species such as sugarbeet, where the vegetative parts of the plants constitute the crop and the reproductive tissues are discarded, it would be advantageous to delay or prevent flowering. Extending vegetative development could bring about large increases in yields.

Inducible Flowering:

By regulating the expression of flowering-time controlling genes, using inducible promoters, flowering could potentially be triggered as desired (for example, by application of a chemical inducer). This would allow, for example, flowering to be synchronized across a crop and facilitate more efficient harvesting. Such inducible systems could be used to tune the flowering of crop varieties to different latitudes. At present, species such as soybean and cotton are available as a series of maturity groups that are suitable for different latitudes on the basis of their flowering time (which is governed by day-length). A system in which flowering could be chemically controlled would allow a single high-yielding northern maturity group to be grown at any latitude. In southern regions such plants could be grown for longer, thereby increasing yields, before flowering was induced. In more northern areas, the induction would be used to ensure that the crop flowers prior to the first winter frosts. Currently, the existence of a series of maturity groups for different latitudes represents a major barrier to the introduction of new valuable traits. Any trait (e.g. disease resistance, vernalization response) has to be bred into each of the different maturity groups separately, a laborious and costly exercise. The availability of a single strain, which could be grown at any latitude, would therefore greatly increase the potential for introducing new traits to crop species such as soybean and cotton.

Producing Polypeptides

The polynucleotides of the invention include sequences that encode transcription factors and transcription factor homolog polypeptides and sequences complementary thereto, as well as unique fragments of coding sequence, or sequence complementary thereto. Such polynucleotides can be, e.g., DNA or RNA, e.g., mRNA, cRNA, synthetic RNA, genomic DNA, cDNA, synthetic DNA, oligonucleotides, etc. The polynucleotides are either double-stranded or single-stranded, and include either or both sense (i.e., coding) sequences and antisense (i.e., non-coding, complementary) sequences. The polynucleotides include the coding sequence of a transcription factor, or transcription factor homolog polypeptide, in isolation, in combination with additional coding sequences (e.g., a purification tag, a localization signal, as a fusion-protein, as a pre-protein, or the like), in combination with non-coding sequences (e.g., introns or inteins, regulatory elements such as promoters, enhancers, terminators, and the like), and/or in a vector or host environment in which the polynucleotide encoding a transcription factor or transcription factor homolog polypeptide is an endogenous or exogenous gene.

A variety of methods exist for producing the polynucleotides of the invention. Procedures for identifying and isolating DNA clones are well known to those of skill in the art, and are described in, e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, vol. 152 Academic Press, Inc., San Diego, Calif. (“Berger”); Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., and Current Protocols in Molecular Biology, Ausubel et al. eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2000) (“Ausubel”).

Alternatively, polynucleotides of the invention can be produced by a variety of in vitro amplification methods adapted to the present invention by appropriate selection of specific or degenerate primers. Examples of protocols sufficient to direct persons of skill through in vitro amplification methods, including the polymerase chain reaction (PCR), the ligase chain reaction (LCR), Q-beta-replicase amplification and other RNA polymerase mediated techniques (e.g., NASBA), e.g., for the production of the homologous nucleic acids of the invention, are found in Berger (supra), Sambrook (supra), and Ausubel (supra), as well as Mullis et al. (1987) PCR Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, Calif. (1990) (Innis). Improved methods for cloning in vitro amplified nucleic acids are described in Wallace et al. U.S. Pat. No. 5,426,039. Improved methods for amplifying large nucleic acids by PCR are summarized in Cheng et al. (1994) Nature 369: 684-685 and the references cited therein, in which PCR amplicons of up to 40 kb are generated. One of skill will appreciate that essentially any RNA can be converted into a double stranded DNA suitable for restriction digestion, PCR expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., Ausubel, Sambrook and Berger, all supra.

Alternatively, polynucleotides and oligonucleotides of the invention can be assembled from fragments produced by solid-phase synthesis methods. Typically, fragments of up to approximately 100 bases are individually synthesized and then enzymatically or chemically ligated to produce a desired sequence, e.g., a polynucleotide encoding all or part of a transcription factor. For example, chemical synthesis using the phosphoramidite method is described, e.g., by Beaucage et al. (1981) Tetrahedron Letters 22: 1859-1869; and Matthes et al. (1984) EMBO J. 3: 801-805. According to such methods, oligonucleotides are synthesized, purified, annealed to their complementary strand, ligated and then optionally cloned into suitable vectors. If so desired, the polynucleotides and polypeptides of the invention can be custom ordered from any of a number of commercial suppliers.

Homologous Sequences

Sequences homologous, i.e., that share significant sequence identity or similarity, to those provided in the Sequence Listing, derived from Arabidopsis thaliana or from other plants of choice, are also an aspect of the invention. Homologous sequences can be derived from any plant including monocots and dicots and in particular agriculturally important plant species, including but not limited to, crops such as soybean, wheat, corn (maize), potato, cotton, rice, rape, oilseed rape (including canola), sunflower, alfalfa, clover, sugarcane, and turf; or fruits and vegetables, such as banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, tobacco, tomato, tomatillo, watermelon, rosaceous fruits (such as apple, peach, pear, cherry and plum) and vegetable brassicas (such as broccoli, cabbage, cauliflower, Brussels sprouts, and kohlrabi). Other crops, including fruits and vegetables, whose phenotype can be changed and which comprise homologous sequences include barley; rye; millet; sorghum; currant; avocado; citrus fruits such as oranges, lemons, grapefruit and tangerines; artichoke; cherries; nuts such as the walnut and peanut; endive; leek; roots such as arrowroot, beet, cassava, turnip, radish, yam, and sweet potato; and beans. The homologous sequences may also be derived from woody species, such as pine, poplar and eucalyptus, or mint or other labiates. In addition, homologous sequences may be derived from plants that are evolutionarily related to crop plants, but which may not have yet been used as crop plants. Examples include deadly nightshade (Atropa belladonna), related to tomato; jimson weed (Datura stramonium), related to peyote; and teosinte (Zea species), related to corn (maize).

Orthologs and Paralogs

Homologous sequences as described above can comprise orthologous or paralogous sequences. Several different methods are known by those of skill in the art for identifying and defining these functionally homologous sequences. Three general methods for defining orthologs and paralogs are described; an ortholog or paralog, including equivalogs, may be identified by one or more of the methods described below.

Orthologs and paralogs are evolutionarily related genes that have similar sequence and similar functions. Orthologs are structurally related genes in different species that are derived by a speciation event. Paralogs are structurally related genes within a single species that are derived by a duplication event.

Within a single plant species, gene duplication may produce two copies of a particular gene, giving rise to two or more genes with similar sequence and often similar function known as paralogs. A paralog is therefore a similar gene formed by duplication within the same species. Paralogs typically cluster together or in the same clade (a group of similar genes) when a gene family phylogeny is analyzed using programs such as CLUSTAL (Thompson et al. (1994) Nucleic Acids Res. 22: 4673-4680; Higgins et al. (1996) Methods Enzymol. 266: 383-402). Groups of similar genes can also be identified with pair-wise BLAST analysis (Feng and Doolittle (1987) J. Mol. Evol. 25: 351-360). For example, a clade of very similar MADS domain transcription factors from Arabidopsis all share a common function in flowering time (Ratcliffe et al. (2001) Plant Physiol. 126: 122-132), and a group of very similar AP2 domain transcription factors from Arabidopsis are involved in tolerance of plants to freezing (Gilmour et al. (1998) Plant J. 16: 433-442). Analysis of groups of similar genes with similar function that fall within one clade can yield sub-sequences that are particular to the clade. These sub-sequences, known as consensus sequences, can be used not only to define the sequences within each clade, but also to define the functions of these genes; genes within a clade may contain paralogous sequences, or orthologous sequences that share the same function (see also, for example, Mount (2001), in Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., page 543).

Speciation, the production of new species from a parental species, can also give rise to two or more genes with similar sequence and similar function. These genes, termed orthologs, often have an identical function within their host plants and are often interchangeable between species without losing function. Because plants have common ancestors, many genes in any plant species will have a corresponding orthologous gene in another plant species. Once a phylogenetic tree for a gene family of one species has been constructed using a program such as CLUSTAL (Thompson et al. (1994) Nucleic Acids Res. 22: 4673-4680; Higgins et al. (1996) supra), potential orthologous sequences can be placed into the phylogenetic tree and their relationship to genes from the species of interest can be determined. Orthologous sequences can also be identified by a reciprocal BLAST strategy. Once an orthologous sequence has been identified, the function of the ortholog can be deduced from the identified function of the reference sequence.
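One common way to implement a reciprocal BLAST strategy is to keep only those pairs in which each sequence is the other's best-scoring hit. The following Python sketch illustrates that idea under stated assumptions: it works on score tables that have already been produced by some search program, the function names best_hits and reciprocal_best_hits are hypothetical, and the gene identifiers and scores are placeholders, not data from this disclosure.

```python
def best_hits(scores):
    """Return a dict mapping each query to its highest-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score, for
    example a bit score parsed from search output.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Placeholder scores for species A queries against species B, and back.
a_vs_b = {
    ("A1", "B1"): 250.0, ("A1", "B2"): 90.0,
    ("A2", "B2"): 180.0, ("A2", "B1"): 60.0,
    ("A3", "B1"): 120.0,
}
b_vs_a = {
    ("B1", "A1"): 245.0, ("B1", "A3"): 118.0,
    ("B2", "A2"): 175.0,
}
candidate_orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(candidate_orthologs)
```

In this toy data A3 finds B1 as its best hit, but B1's best hit is A1, so A3 is not reported. Pairs that pass the reciprocal test are candidates only; placement in a phylogenetic tree, as described above, remains a separate check.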

Results from alignment analysis using SSEARCH of 2,079 protein domains have shown that alignments of two unrelated polypeptide sequences have an upper threshold of percentage amino acid residue identity and that this percentage identity threshold decreases with the length of the alignment (Brenner et al. (1998) Proc. Natl. Acad. Sci. 95: 6073-6078). Brenner et al. suggested that it is probably necessary for alignments to be at least 70 residues in length before 40% identity is a reasonable threshold, such that the likelihood of two sequences being related is not due to chance. Furthermore, Brenner et al. (1998, supra) showed that 30% identity is a reliable threshold for the database of 2,079 domains only for sequence alignments of at least 150 residues. Brenner et al. disclosed a chart which showed where the percentage identity and alignment length limits of such unrelated sequences cluster (see Brenner et al. (1998, supra), FIG. 3, page 6075). From such a chart, an alignment of two polypeptide sequences may be plotted to ascertain whether they fall within or without the region of unrelated sequences. For example, two polypeptide sequences which are aligned over 73 residues having at least 50% sequence identity are above the threshold for which they might be considered unrelated and therefore are, more likely than not, related to one another.

In addition, Bork has shown that there is a 70% accuracy rate for bioinformatics-based predictions in general, and a 90% accuracy rate for the prediction of functional features by homology (Bork (2000) Genome Res. 10: 398-400; Table 1 on page 399).

Transcription factor gene sequences are conserved across diverse eukaryotic species lines (Goodrich et al. (1993) Cell 75: 519-530; Lin et al. (1991) Nature 353: 569-571; Sadowski et al. (1988) Nature 335: 563-564). Plants are no exception to this observation; diverse plant species possess transcription factors that have similar sequences and functions.

Orthologous genes from different organisms have highly conserved functions, and very often essentially identical functions (Lee et al. (2002) Genome Res. 12: 493-502; Remm et al. (2001) J. Mol. Biol. 314: 1041-1052). Paralogous genes, which have diverged through gene duplication, may retain similar functions of the encoded proteins. In such cases, paralogs can be used interchangeably with respect to certain embodiments of the instant invention (for example, transgenic expression of a coding sequence). An example of such highly related paralogs is the CBF family, with three well-defined members in Arabidopsis and at least one ortholog in Brassica napus (SEQ ID NOs: 44, 46, 48, and 1941, respectively), all of which control pathways involved in both freezing and drought stress (Gilmour et al. (1998) Plant J. 16: 433-442; Jaglo et al. (1998) Plant Physiol. 127: 910-917).

The following references represent a small sampling of the many studies that demonstrate that conserved transcription factor genes from diverse species are likely to function similarly (i.e., regulate similar target sequences and control the same traits), and that transcription factors may be transformed into diverse species to confer or improve traits.

(1) The Arabidopsis NPR1 gene regulates systemic acquired resistance (SAR); over-expression of NPR1 leads to enhanced resistance in Arabidopsis. When either Arabidopsis NPR1 or the rice NPR1 ortholog was overexpressed in rice (which, as a monocot, is diverse from Arabidopsis) and the plants were challenged with the rice bacterial blight pathogen Xanthomonas oryzae pv. oryzae, the transgenic plants displayed enhanced resistance (Chern et al. (2001) Plant J. 27: 101-113). NPR1 acts through activation of expression of transcription factor genes, such as TGA2 (Fan and Dong (2002) Plant Cell 14: 1377-1389).

(2) E2F genes are involved in transcription of plant genes for proliferating cell nuclear antigen (PCNA). Plant E2Fs share a high degree of similarity in amino acid sequence between monocots and dicots, and are even similar to the conserved domains of the animal E2Fs. Such conservation indicates a functional similarity between plant and animal E2Fs. E2F transcription factors that regulate meristem development act through common cis-elements, and regulate related (PCNA) genes (Kosugi and Ohashi (2002) Plant J. 29: 45-59).

(3) The ABI5 gene (abscisic acid (ABA) insensitive 5) encodes a basic leucine zipper factor required for ABA response in the seed and vegetative tissues. Co-transformation experiments with ABI5 cDNA constructs in rice protoplasts resulted in specific transactivation of the ABA-inducible wheat, Arabidopsis, bean, and barley promoters. These results demonstrate that sequentially similar ABI5 transcription factors are key targets of a conserved ABA signaling pathway in diverse plants (Gampala et al. (2001) J. Biol. Chem. 277: 1689-1694).

(4) Sequences of three Arabidopsis GAMYB-like genes were obtained on the basis of sequence similarity to GAMYB genes from barley, rice, and L. temulentum. These three Arabidopsis genes were determined to encode transcription factors (AtMYB33, AtMYB65, and AtMYB101) and could substitute for a barley GAMYB and control alpha-amylase expression (Gocal et al. (2001) Plant Physiol. 127: 1682-1693).

(5) The floral control gene LEAFY from Arabidopsis can dramatically accelerate flowering in numerous dicotyledonous plants. Constitutive expression of Arabidopsis LEAFY also caused early flowering in transgenic rice (a monocot), with a heading date that was 26-34 days earlier than that of wild-type plants. These observations indicate that floral regulatory genes from Arabidopsis are useful tools for heading date improvement in cereal crops (He et al. (2000) Transgenic Res. 9: 223-227).

(6) Bioactive gibberellins (GAs) are essential endogenous regulators of plant growth. GA signaling tends to be conserved across the plant kingdom. GA signaling is mediated via GAI, a nuclear member of the GRAS family of plant transcription factors. Arabidopsis GAI has been shown to function in rice to inhibit gibberellin response pathways (Fu et al. (2001) Plant Cell 13: 1791-1802).

(7) The Arabidopsis gene SUPERMAN (SUP) encodes a putative transcription factor that maintains the boundary between stamens and carpels. By over-expressing Arabidopsis SUP in rice, the effect of the gene's presence on whorl boundaries was shown to be conserved. This demonstrated that SUP is a conserved regulator of floral whorl boundaries and affects cell proliferation (Nandi et al. (2000) Curr. Biol. 10: 215-218).

(8) Maize, petunia and Arabidopsis myb transcription factors that regulate flavonoid biosynthesis are very genetically similar and affect the same trait in their native species; therefore, the sequence and function of these myb transcription factors correlate with each other in these diverse species (Borevitz et al. (2000) Plant Cell 12: 2383-2394).

(9) Wheat reduced height-1 (Rht-B1/Rht-D1) and maize dwarf-8 (d8) genes are orthologs of the Arabidopsis gibberellin insensitive (GAI) gene. Both of these genes have been used to produce dwarf grain varieties that have improved grain yield. These genes encode proteins that resemble nuclear transcription factors and contain an SH2-like domain, indicating that phosphotyrosine may participate in gibberellin signaling. Transgenic rice plants containing a mutant GAI allele from Arabidopsis have been shown to produce reduced responses to gibberellin and are dwarfed, indicating that mutant GAI orthologs could be used to increase yield in a wide range of crop species (Peng et al. (1999) Nature 400: 256-261).

Transcription factors that are homologous to the listed sequences will typically share, in at least one conserved domain, at least about 70% amino acid sequence identity, and with regard to zinc finger transcription factors, at least about 50% amino acid sequence identity. More closely related transcription factors can share at least about 70%, or about 75% or about 80% or about 90% or about 95% or about 98% or more sequence identity with the listed sequences, or with the listed sequences but excluding or outside a known consensus sequence or consensus DNA-binding site, or with the listed sequences excluding one or all conserved domains. Factors that are most closely related to the listed sequences share, e.g., at least about 85%, about 90% or about 95% or more sequence identity to the listed sequences, or to the listed sequences but excluding or outside a known consensus sequence or consensus DNA-binding site or outside one or all conserved domains. At the nucleotide level, the sequences will typically share at least about 40% nucleotide sequence identity, preferably at least about 50%, about 60%, about 70% or about 80% sequence identity, and more preferably about 85%, about 90%, about 95% or about 97% or more sequence identity to one or more of the listed sequences, or to a listed sequence but excluding or outside a known consensus sequence or consensus DNA-binding site, or outside one or all conserved domains. The degeneracy of the genetic code enables major variations in the nucleotide sequence of a polynucleotide while maintaining the amino acid sequence of the encoded protein. Conserved domains within a transcription factor family may exhibit a higher degree of sequence homology, such as at least 65% amino acid sequence identity including conservative substitutions, and preferably at least 80% sequence identity, and more preferably at least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at least about 90%, or at least about 95%, or at least about 98% sequence identity. Transcription factors that are homologous to the listed sequences should share at least 30%, or at least about 60%, or at least about 75%, or at least about 80%, or at least about 90%, or at least about 95% amino acid sequence identity over the entire length of the polypeptide or the homolog.

Percent identity can be determined electronically, e.g., by using the MEGALIGN program (DNASTAR, Inc. Madison, Wis.). The MEGALIGN program can create alignments between two or more sequences according to different methods, for example, the clustal method. (See, for example, Higgins and Sharp (1988) Gene 73: 237-244.) The clustal algorithm groups sequences into clusters by examining the distances between all pairs. The clusters are aligned pairwise and then in groups. Other alignment algorithms or programs may be used, including FASTA, BLAST, or ENTREZ, and may be used to calculate percent similarity. FASTA and BLAST are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.), and can be used with or without default settings. ENTREZ is available through the National Center for Biotechnology Information. In one embodiment, the percent identity of two sequences can be determined by the GCG program with a gap weight of 1, e.g., each amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch between the two sequences (see U.S. Pat. No. 6,262,333).
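To make the gap-weight convention concrete, the following Python sketch computes percent identity from two rows of an existing alignment, counting each gap position as a single mismatch. It is an illustration of that counting rule only and is not the GCG or MEGALIGN implementation; the function name percent_identity, the "-" gap symbol, and the two short example rows are assumptions made here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over an alignment, with each gap scored as a mismatch.

    Both inputs must be rows of the same alignment and therefore have the
    same length. Columns that are gaps in both rows are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # shared gap column carries no pairwise information
        columns += 1
        if res_a == res_b:
            matches += 1  # a gap opposite a residue never matches
    if columns == 0:
        return 0.0
    return 100.0 * matches / columns


# Made-up aligned fragments: 9 columns, 7 identical, 1 gap, 1 substitution.
identity = percent_identity("MGRKKVEIK", "MGR-KLEIK")
print(round(identity, 1))
```

Other programs normalize by the shorter sequence length or exclude gap columns entirely, so values computed under different conventions are not directly comparable.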

Other techniques for alignment are described in Doolittle, R. F. (1996) Methods in Enzymology: Computer Methods for Macromolecular Sequence Analysis, vol. 266, Academic Press, Orlando, Fla., USA. Preferably, an alignment program that permits gaps in the sequence is utilized to align the sequences. The Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence alignments (see Shpaer (1997) Methods Mol. Biol. 70: 173-187). Also, the GAP program using the Needleman and Wunsch alignment method can be utilized to align sequences. An alternative search strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a Smith-Waterman algorithm to score sequences on a massively parallel computer. This approach improves the ability to pick up distantly related matches, and is especially tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino acid sequences can be used to search both protein and DNA databases.
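For readers unfamiliar with gapped local alignment, a compact Python sketch of the Smith-Waterman recurrence follows. It is a teaching example under simplifying assumptions: a flat match/mismatch score in place of a substitution matrix, a linear gap penalty, and arbitrary parameter values chosen here. It does not reproduce the scoring used by MPSRCH or any other named program.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below zero, which is what makes
    # the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")  # gap inserted in sequence b
            i -= 1
        else:
            out_a.append("-")  # gap inserted in sequence a
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Two made-up sequences that share the stretch EIKRIENK.
result = smith_waterman("AAAEIKRIENKAAA", "CCEIKRIENKCC")
print(result)
```

The matrix fill takes time proportional to the product of the two sequence lengths, which is why database-scale searches with this algorithm have historically relied on parallel hardware or on heuristics such as BLAST and FASTA.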

The percentage similarity between two polypeptide sequences, e.g., sequence A and sequence B, is calculated by dividing the length of sequence A, minus the number of gap residues in sequence A, minus the number of gap residues in sequence B, into the sum of the residue matches between sequence A and sequence B, times one hundred. Gaps of low or of no similarity between the two amino acid sequences are not included in determining percentage similarity. Percent identity between polynucleotide sequences can also be counted or calculated by other methods known in the art, e.g., the Jotun Hein method. (See, e.g., Hein (1990) Methods Enzymol. 183: 626-645.) Identity between sequences can also be determined by other methods known in the art, e.g., by varying hybridization conditions (see US Patent Application No. 20010010913).
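The calculation described in the preceding paragraph can be written as 100 × matches / (length of A − gap residues in A − gap residues in B). The following Python sketch encodes that arithmetic directly. The function name percent_similarity and the example numbers are hypothetical, and the sketch takes the four quantities as given rather than deciding how a particular alignment program counts them.

```python
def percent_similarity(length_a, gaps_a, gaps_b, matches):
    """Percentage similarity as described in the text.

    length_a: length of sequence A as used by the alignment program
    gaps_a, gaps_b: number of gap residues in sequences A and B
    matches: sum of residue matches between sequences A and B
    """
    denominator = length_a - gaps_a - gaps_b
    if denominator <= 0:
        raise ValueError("length of A must exceed the total number of gap residues")
    return 100.0 * matches / denominator


# Hypothetical numbers: 60 positions, 2 and 3 gap residues, 44 matches.
similarity = percent_similarity(60, 2, 3, 44)
print(similarity)
```

Because the gap residues are subtracted from the denominator, regions of low or no similarity that are represented as gaps do not lower the reported value.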

The percent identity between two conserved domains of a transcription factor DNA-binding domain consensus polypeptide sequence can be as low as 16%, as exemplified in the case of GATA1 family of eukaryotic Cys₂/Cys₂-type zinc finger transcription factors. The DNA-binding domain consensus polypeptide sequence of the GATA1 family is CX₂CX₁₇CX₂C, where X is any amino acid residue. (See, for example, Takatsuji, supra.) Other examples of such conserved consensus polypeptide sequences with low overall percent sequence identity are well known to those of skill in the art.
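A consensus such as CX₂CX₁₇CX₂C can be located in a polypeptide sequence with a simple pattern search. The sketch below uses Python's standard `re` module; the function name and the test sequence are hypothetical, and the pattern encodes only the spacing of the four cysteines, not any other property of the domain.

```python
import re

# C, any 2 residues, C, any 17 residues, C, any 2 residues, C.
# The lookahead lets overlapping occurrences be reported as well.
GATA_CONSENSUS = re.compile(r"(?=(C.{2}C.{17}C.{2}C))")


def find_consensus(protein: str) -> list[int]:
    """Return 0-based start positions of every CX2CX17CX2C match."""
    return [m.start() for m in GATA_CONSENSUS.finditer(protein)]


example = "MK" + "CAAC" + "G" * 17 + "CTTC" + "LL"  # made-up sequence
print(find_consensus(example))  # [2]
```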

Thus, the invention provides methods for identifying a sequence similar or paralogous or orthologous or homologous to one or more polynucleotides as noted herein, or one or more target polypeptides encoded by the polynucleotides, or otherwise noted herein and may include linking or associating a given plant phenotype or gene function with a sequence. In the methods, a sequence database is provided (locally or across an internet or intranet) and a query is made against the sequence database using the relevant sequences herein and associated plant phenotypes or gene functions.

In addition, one or more polynucleotide sequences or one or more polypeptides encoded by the polynucleotide sequences may be used to search against BLOCKS (Bairoch et al. (1997) Nucleic Acids Res. 25: 217-221), PFAM, and other databases which contain previously identified and annotated motifs, sequences and gene functions. Methods that search for primary sequence patterns with secondary structure gap penalties (Smith et al. (1992) Protein Engineering 5: 35-51) as well as algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul (1993) J. Mol. Evol. 36: 290-300; Altschul et al. (1990) supra), BLOCKS (Henikoff and Henikoff (1991) Nucleic Acids Res. 19: 6565-6572), Hidden Markov Models (HMM; Eddy (1996) Curr. Opin. Str. Biol. 6: 361-365; Sonnhammer et al. (1997) Proteins 28: 405-420), and the like, can be used to manipulate and analyze polynucleotide and polypeptide sequences encoded by polynucleotides. These databases, algorithms and other methods are well known in the art and are described in Ausubel et al. (1997; Short Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y., unit 7.7) and in Meyers (1995; Molecular Biology and Biotechnology, Wiley VCH, New York, N.Y., p 856-853).

Furthermore, methods using manual alignment of sequences similar or homologous to one or more polynucleotide sequences or one or more polypeptides encoded by the polynucleotide sequences may be used to identify regions of similarity and conserved domains. Such manual methods are well known to those of skill in the art and can include, for example, comparisons of tertiary structure between a polypeptide sequence encoded by a polynucleotide which comprises a known function with a polypeptide sequence encoded by a polynucleotide sequence which has a function not yet determined. Such examples of tertiary structure may comprise predicted alpha helices, beta-sheets, amphipathic helices, leucine zipper motifs, zinc finger motifs, proline-rich regions, cysteine repeat motifs, and the like.

Orthologs and paralogs of presently disclosed transcription factors may be cloned using compositions provided by the present invention according to methods well known in the art. cDNAs can be cloned using mRNA from a plant cell or tissue that expresses one of the present transcription factors. Appropriate mRNA sources may be identified by interrogating Northern blots with probes designed from the present transcription factor sequences, after which a library is prepared from the mRNA obtained from a positive cell or tissue. Transcription factor-encoding cDNA is then isolated using, for example, PCR, using primers designed from a presently disclosed transcription factor gene sequence, or by probing with a partial or complete cDNA or with one or more sets of degenerate probes based on the disclosed sequences. The cDNA library may be used to transform plant cells. Expression of the cDNAs of interest is detected using, for example, methods disclosed herein such as microarrays, Northern blots, quantitative PCR, or any other technique for monitoring changes in expression. Genomic clones may be isolated using similar techniques.

Identifying Polynucleotides or Nucleic Acids by Hybridization

Polynucleotides homologous to the sequences illustrated in the Sequence Listing and tables can be identified, e.g., by hybridization to each other under stringent or under highly stringent conditions. Single-stranded polynucleotides hybridize when they associate based on a variety of well characterized physical-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like. The stringency of a hybridization reflects the degree of sequence identity of the nucleic acids involved, such that the higher the stringency, the more similar are the two polynucleotide strands. Stringency is influenced by a variety of factors, including temperature, salt concentration and composition, organic and non-organic additives, solvents, etc. present in both the hybridization and wash solutions and incubations (and number thereof), as described in more detail in the references cited above.

Encompassed by the invention are polynucleotide sequences that are capable of hybridizing to the claimed polynucleotide sequences, including any of the transcription factor polynucleotides within the Sequence Listing, and fragments thereof under various conditions of stringency (See, for example, Wahl and Berger (1987) Methods Enzymol. 152: 399-407; and Kimmel (1987) Methods Enzymol. 152: 507-511). In addition to the nucleotide sequences listed in Tables 4 and 5, full length cDNA, orthologs, and paralogs of the present nucleotide sequences may be identified and isolated using well-known methods. The cDNA libraries, orthologs, and paralogs of the present nucleotide sequences may be screened using hybridization methods to determine their utility as hybridization target or amplification probes.

With regard to hybridization, conditions that are highly stringent, and means for achieving them, are well known in the art. See, for example, Sambrook et al. (1989) “Molecular Cloning: A Laboratory Manual” (2nd ed., Cold Spring Harbor Laboratory); Berger and Kimmel, eds., (1987) “Guide to Molecular Cloning Techniques”, In Methods in Enzymology: 152: 467-469; and Anderson and Young (1985) “Quantitative Filter Hybridisation.” In: Hames and Higgins, ed., Nucleic Acid Hybridisation, A Practical Approach. Oxford, IRL Press, 73-111.

Stability of DNA duplexes is affected by such factors as base composition, length, and degree of base pair mismatch. Hybridization conditions may be adjusted to allow DNAs of different sequence relatedness to hybridize. The melting temperature (T_(m)) is defined as the temperature at which 50% of the duplex molecules have dissociated into their constituent single strands. The melting temperature of a perfectly matched duplex, where the hybridization buffer contains formamide as a denaturing agent, may be estimated by the following equations:

(I) DNA-DNA: T_(m)(° C.)=81.5+16.6(log [Na+])+0.41(% G+C)−0.62(% formamide)−500/L

(II) DNA-RNA: T_(m)(° C.)=79.8+18.5(log [Na+])+0.58(% G+C)+0.12(% G+C)²−0.5(% formamide)−820/L

(III) RNA-RNA: T_(m)(° C.)=79.8+18.5(log [Na+])+0.58(% G+C)+0.12(% G+C)²−0.35(% formamide)−820/L

where L is the length of the duplex formed, [Na+] is the molar concentration of the sodium ion in the hybridization or washing solution, and % G+C is the percentage of (guanine+cytosine) bases in the hybrid. For imperfectly matched hybrids, the melting temperature is reduced by approximately 1° C. for each 1% mismatch.
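As a worked illustration of equation (I) for DNA-DNA duplexes, together with the approximately 1° C. per 1% mismatch correction, consider the sketch below. It assumes that "log" denotes the base-10 logarithm; the function and parameter names and the example values are hypothetical. Equations (II) and (III) are not implemented here.

```python
import math


def tm_dna_dna(na_molar: float, gc_percent: float, formamide_percent: float,
               length_bp: int, mismatch_percent: float = 0.0) -> float:
    """Estimate T_m (deg C) of a DNA-DNA duplex per equation (I)."""
    tm = (81.5
          + 16.6 * math.log10(na_molar)   # assumption: log is base 10
          + 0.41 * gc_percent
          - 0.62 * formamide_percent
          - 500.0 / length_bp)
    # Roughly 1 deg C lower for each 1% of mismatched bases.
    return tm - mismatch_percent


# 1 M Na+, 50% G+C, no formamide, 500 bp duplex:
# 81.5 + 0 + 20.5 - 0 - 1 = 101.0
print(tm_dna_dna(1.0, 50.0, 0.0, 500))
```

Adding 50% formamide to the same example lowers the estimate by 31° C., which shows why formamide allows stringent hybridization at lower temperatures.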

Hybridization experiments are generally conducted in a buffer of pH between 6.8 and 7.4, although the rate of hybridization is nearly independent of pH at ionic strengths likely to be used in the hybridization buffer (Anderson et al. (1985) supra). In addition, one or more of the following may be used to reduce non-specific hybridization: sonicated salmon sperm DNA or another non-complementary DNA, bovine serum albumin, sodium pyrophosphate, sodium dodecylsulfate (SDS), polyvinylpyrrolidone, ficoll and Denhardt's solution. Dextran sulfate and polyethylene glycol 6000 act to exclude DNA from solution, thus raising the effective probe DNA concentration and the hybridization signal within a given unit of time. In some instances, conditions of even greater stringency may be desirable or required to reduce non-specific and/or background hybridization. These conditions may be created with the use of higher temperature, lower ionic strength and higher concentration of a denaturing agent such as formamide.

Stringency conditions can be adjusted to screen for moderately similar fragments such as homologous sequences from distantly related organisms, or to highly similar fragments such as genes that duplicate functional enzymes from closely related organisms. The stringency can be adjusted either during the hybridization step or in the post-hybridization washes. Salt concentration, formamide concentration, hybridization temperature and probe lengths are variables that can be used to alter stringency (as described by the formulas above). As a general guideline, high stringency is typically performed at T_(m)−5° C. to T_(m)−20° C., moderate stringency at T_(m)−20° C. to T_(m)−35° C. and low stringency at T_(m)−35° C. to T_(m)−50° C. for duplexes >150 base pairs. Hybridization may be performed at low to moderate stringency (25-50° C. below T_(m)), followed by post-hybridization washes at increasing stringencies. Maximum rates of hybridization in solution are determined empirically to occur at T_(m)−25° C. for DNA-DNA duplex and T_(m)−15° C. for RNA-DNA duplex. Optionally, the degree of dissociation may be assessed after each wash step to determine the need for subsequent, higher stringency wash steps.
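The general guideline above can be expressed as a small lookup on the difference between the estimated T_(m) and the working temperature. The sketch below is illustrative only: the function name is hypothetical, and the assignment of the shared boundary values (exactly 20° C. and 35° C. below T_(m)) to the more stringent class is an assumption, since the text does not say which class they belong to.

```python
def stringency_class(tm_celsius: float, temp_celsius: float) -> str:
    """Classify a hybridization or wash temperature relative to T_m."""
    offset = tm_celsius - temp_celsius  # degrees below T_m
    if 5 <= offset <= 20:
        return "high"
    if 20 < offset <= 35:
        return "moderate"
    if 35 < offset <= 50:
        return "low"
    return "outside guideline ranges"


print(stringency_class(80.0, 70.0))  # high (10 degrees below T_m)
print(stringency_class(80.0, 50.0))  # moderate (30 degrees below T_m)
```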

High stringency conditions may be used to select for nucleic acid sequences with high degrees of identity to the disclosed sequences. An example of stringent hybridization conditions obtained in a filter-based method such as a Southern or northern blot for hybridization of complementary nucleic acids that have more than 100 complementary residues is about 5° C. to 20° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. Conditions used for hybridization may include about 0.02 M to about 0.15 M sodium chloride, about 0.5% to about 5% casein, about 0.02% SDS or about 0.1% N-laurylsarcosine, about 0.001 M to about 0.03 M sodium citrate, at hybridization temperatures between about 50° C. and about 70° C. More preferably, high stringency conditions are about 0.02 M sodium chloride, about 0.5% casein, about 0.02% SDS, about 0.001 M sodium citrate, at a temperature of about 50° C. Nucleic acid molecules that hybridize under stringent conditions will typically hybridize to a probe based on either the entire DNA molecule or selected portions, e.g., to a unique subsequence, of the DNA.

Stringent salt concentration will ordinarily be less than about 750 mM NaCl and 75 mM trisodium citrate. Increasingly stringent conditions may be obtained with less than about 500 mM NaCl and 50 mM trisodium citrate, to even greater stringency with less than about 250 mM NaCl and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the absence of organic solvent, e.g., formamide, whereas high stringency hybridization may be obtained in the presence of at least about 35% formamide, and more preferably at least about 50% formamide. Stringent temperature conditions will ordinarily include temperatures of at least about 30° C., more preferably of at least about 37° C., and most preferably of at least about 42° C. with formamide present. Varying additional parameters, such as hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS) and ionic strength, are well known to those skilled in the art. Various levels of stringency are accomplished by combining these various conditions as needed.

The washing steps that follow hybridization may also vary in stringency; the post-hybridization wash steps primarily determine hybridization specificity, with the most critical factors being temperature and the ionic strength of the final wash solution. Wash stringency can be increased by decreasing salt concentration or by increasing temperature. Stringent salt concentration for the wash steps will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate.

Thus, hybridization and wash conditions that may be used to bind and remove polynucleotides with less than the desired homology to the nucleic acid sequences or their complements that encode the present transcription factors include, for example:

-   6×SSC at 65° C.;
-   50% formamide, 4×SSC at 42° C.; or
-   0.5×SSC, 0.1% SDS at 65° C.;

with, for example, two wash steps of 10-30 minutes each. Useful variations on these conditions will be readily apparent to those skilled in the art.

A person of skill in the art would not expect substantial variation among polynucleotide species encompassed within the scope of the present invention because the highly stringent conditions set forth in the above formulae yield structurally similar polynucleotides.

If desired, one may employ wash steps of even greater stringency, including about 0.2×SSC, 0.1% SDS at 65° C. and washing twice, each wash step being about 30 min, or about 0.1×SSC, 0.1% SDS at 65° C. and washing twice for 30 min. The temperature for the wash solutions will ordinarily be at least about 25° C., and for greater stringency at least about 42° C. Hybridization stringency may be increased further by using the same conditions as in the hybridization steps, with the wash temperature raised about 3° C. to about 5° C., and stringency may be increased even further by using the same conditions except the wash temperature is raised about 6° C. to about 9° C. For identification of less closely related homologs, wash steps may be performed at a lower temperature, e.g., 50° C.

An example of a low stringency wash step employs a solution and conditions of at least 25° C. in 30 mM NaCl, 3 mM trisodium citrate, and 0.1% SDS over 30 min. Greater stringency may be obtained at 42° C. in 15 mM NaCl, with 1.5 mM trisodium citrate, and 0.1% SDS over 30 min. Even higher stringency wash conditions are obtained at 65° C.-68° C. in a solution of 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. Wash procedures will generally employ at least two final wash steps. Additional variations on these conditions will be readily apparent to those skilled in the art (see, for example, U.S. Patent Application No. 20010010913).
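The wash conditions in this section are stated both as multiples of SSC and as millimolar salt concentrations. The sketch below converts between the two, using the standard composition of 1×SSC (150 mM NaCl, 15 mM trisodium citrate), which is consistent with the paired values given above (for example, 0.2×SSC corresponds to 30 mM NaCl and 3 mM trisodium citrate). The function name is hypothetical.

```python
NACL_MM_PER_1X_SSC = 150.0     # mM NaCl in 1x SSC
CITRATE_MM_PER_1X_SSC = 15.0   # mM trisodium citrate in 1x SSC


def ssc_to_millimolar(ssc_fold: float) -> tuple[float, float]:
    """Return (mM NaCl, mM trisodium citrate) for an n-fold SSC solution."""
    # Rounding removes floating-point noise such as 30.000000000000004.
    return (round(ssc_fold * NACL_MM_PER_1X_SSC, 6),
            round(ssc_fold * CITRATE_MM_PER_1X_SSC, 6))


print(ssc_to_millimolar(0.2))  # (30.0, 3.0)
print(ssc_to_millimolar(0.1))  # (15.0, 1.5)
```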

Stringency conditions can be selected such that an oligonucleotide that is perfectly complementary to the coding oligonucleotide hybridizes to the coding oligonucleotide with at least about a 5-10× higher signal to noise ratio than the ratio for hybridization of the perfectly complementary oligonucleotide to a nucleic acid encoding a transcription factor known as of the filing date of the application. It may be desirable to select conditions for a particular assay such that a higher signal to noise ratio, that is, about 15× or more, is obtained. Accordingly, a subject nucleic acid will hybridize to a unique coding oligonucleotide with at least a 2× or greater signal to noise ratio as compared to hybridization of the coding oligonucleotide to a nucleic acid encoding a known polypeptide. The particular signal will depend on the label used in the relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or the like. Labeled hybridization or PCR probes for detecting related polynucleotide sequences may be produced by oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.

Identifying Polynucleotides or Nucleic Acids with Expression Libraries

In addition to hybridization methods, transcription factor homolog polypeptides can be obtained by screening an expression library using antibodies specific for one or more transcription factors. With the provision herein of the disclosed transcription factor, and transcription factor homolog nucleic acid sequences, the encoded polypeptide(s) can be expressed and purified in a heterologous expression system (e.g., E. coli) and used to raise antibodies (monoclonal or polyclonal) specific for the polypeptide(s) in question. Antibodies can also be raised against synthetic peptides derived from transcription factor, or transcription factor homolog, amino acid sequences. Methods of raising antibodies are well known in the art and are described in Harlow and Lane (1988), Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New York. Such antibodies can then be used to screen an expression library produced from the plant from which it is desired to clone additional transcription factor homologs, using the methods described above. The selected cDNAs can be confirmed by sequencing and enzymatic activity.

Encompassed by the invention are polynucleotide sequences that are capable of hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID NO: 567, SEQ ID NO: 943, SEQ ID NO: 945, SEQ ID NO: 947, SEQ ID NO: 1734, SEQ ID NO: 1874, SEQ ID NO: 1014, SEQ ID NO: 1970, SEQ ID NO: 1972, SEQ ID NO: 1944, SEQ ID NO: 1946, SEQ ID NO: 1948, SEQ ID NO: 1950, SEQ ID NO: 1952, SEQ ID NO: 1954, SEQ ID NO: 1956, SEQ ID NO: 1958, SEQ ID NO: 1960, SEQ ID NO: 1962, SEQ ID NO: 1964, SEQ ID NO: 1966, SEQ ID NO: 1968, SEQ ID NO: 1970, SEQ ID NO: 1972, and fragments thereof under various conditions of stringency. (See, e.g., Wahl and Berger (1987) Methods Enzymol. 152: 399-407; Kimmel (1987) Methods Enzymol. 152: 507-511.) Estimates of homology are provided by either DNA-DNA or DNA-RNA hybridization under conditions of stringency as is well understood by those skilled in the art (Hames and Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL Press, Oxford, U.K.). Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions.

Sequence Variations

It will readily be appreciated by those of skill in the art that any of a variety of polynucleotide sequences are capable of encoding the transcription factors and transcription factor homolog polypeptides of the invention. Due to the degeneracy of the genetic code, many different polynucleotides can encode identical and/or substantially similar polypeptides in addition to those sequences illustrated in the Sequence Listing. Nucleic acids having a sequence that differs from the sequences shown in the Sequence Listing, or complementary sequences, that encode functionally equivalent peptides (i.e., peptides having some degree of equivalent or similar biological activity) but differ in sequence from the sequence shown in the Sequence Listing due to degeneracy in the genetic code, are also within the scope of the invention.

Altered polynucleotide sequences encoding polypeptides include those sequences with deletions, insertions, or substitutions of different nucleotides, resulting in a polynucleotide encoding a polypeptide with at least one functional characteristic of the instant polypeptides. Included within this definition are polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding the instant polypeptides, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding the instant polypeptides.

Allelic variant refers to any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in phenotypic polymorphism within populations. Gene mutations can be silent (i.e., no change in the encoded polypeptide) or may encode polypeptides having altered amino acid sequence. The term allelic variant is also used herein to denote a protein encoded by an allelic variant of a gene. Splice variant refers to alternative forms of RNA transcribed from a gene. Splice variation arises naturally through use of alternative splicing sites within a transcribed RNA molecule, or less commonly between separately transcribed RNA molecules, and may result in several mRNAs transcribed from the same gene. Splice variants may encode polypeptides having altered amino acid sequence. The term splice variant is also used herein to denote a protein encoded by a splice variant of an mRNA transcribed from a gene.

Those skilled in the art would recognize that, for example, G682, SEQ ID NO: 468, represents a single transcription factor; allelic variation and alternative splicing may be expected to occur. Allelic variants of SEQ ID NO: 467 can be cloned by probing cDNA or genomic libraries from different individual organisms according to standard procedures. Allelic variants of the DNA sequence shown in SEQ ID NO: 467, including those containing silent mutations and those in which mutations result in amino acid sequence changes, are within the scope of the present invention, as are proteins which are allelic variants of SEQ ID NO: 468. cDNAs generated from alternatively spliced mRNAs, which retain the properties of the transcription factor, are included within the scope of the present invention, as are polypeptides encoded by such cDNAs and mRNAs. Allelic variants and splice variants of these sequences can be cloned by probing cDNA or genomic libraries from different individual organisms or tissues according to standard procedures known in the art (see U.S. Pat. No. 6,388,064).

Those skilled in the art would recognize that G859, SEQ ID NOs: 567 and 568, represents a single MAF transcription factor; allelic variation and alternative splicing may be expected to occur. Allelic variants of SEQ ID NO: 567 can be cloned by probing cDNA or genomic libraries from different individual organisms according to standard procedures. Allelic variants of the DNA sequence shown in SEQ ID NO: 567, including those containing silent mutations and those in which mutations result in amino acid sequence changes, are within the scope of the present invention, as are proteins which are allelic variants of SEQ ID NO: 568. cDNAs generated from alternatively spliced mRNAs, which retain the properties of the MAF transcription factor, are included within the scope of the present invention, as are polypeptides encoded by such cDNAs and mRNAs. Allelic variants and splice variants of these sequences can be cloned by probing cDNA or genomic libraries from different individual organisms or tissues according to standard procedures known in the art (see U.S. Pat. No. 6,388,064).

Examples of allelic variants or alternatively spliced variants of SEQ ID NO: 567 (G859) are SEQ ID NO: 1946 (G859.1), SEQ ID NO: 1944 (G859.3), SEQ ID NO: 1948 (G859.4), and SEQ ID NO: 1950 (G859.5). These variants encode SEQ ID NOs: 1947, 1945, 1949, and 1951, respectively, which are variants of SEQ ID NO: 568.

Thus, in addition to the sequences set forth in the Sequence Listing and in Table 4, the invention also encompasses related nucleic acid molecules that include allelic or splice variants of SEQ ID NO: 567, SEQ ID NO: 943, SEQ ID NO: 945, SEQ ID NO: 947, SEQ ID NO: 1734, SEQ ID NO: 1874, SEQ ID NO: 1014, SEQ ID NO: 1970, SEQ ID NO: 1972, SEQ ID NO: 1944, SEQ ID NO: 1946, SEQ ID NO: 1948, SEQ ID NO: 1950, SEQ ID NO: 1952, SEQ ID NO: 1954, SEQ ID NO: 1956, SEQ ID NO: 1958, SEQ ID NO: 1960, SEQ ID NO: 1962, SEQ ID NO: 1964, SEQ ID NO: 1966, or SEQ ID NO: 1968, and include sequences which are complementary to any of the above nucleotide sequences. Related nucleic acid molecules also include nucleotide sequences encoding a polypeptide comprising or consisting essentially of a substitution, modification, addition and/or deletion of one or more amino acid residues compared to the polypeptide as set forth in any of SEQ ID NO: 568, SEQ ID NO: 944, SEQ ID NO: 946, SEQ ID NO: 948, SEQ ID NO: 1735, SEQ ID NO: 1875, SEQ ID NO: 1971, SEQ ID NO: 1973, SEQ ID NO: 1945, SEQ ID NO: 1947, SEQ ID NO: 1949, SEQ ID NO: 1951, SEQ ID NO: 1953, SEQ ID NO: 1955, SEQ ID NO: 1957, SEQ ID NO: 1959, SEQ ID NO: 1961, SEQ ID NO: 1963, SEQ ID NO: 1965, SEQ ID NO: 1967, or SEQ ID NO: 1969. Such related polypeptides may comprise, for example, additions and/or deletions of one or more N-linked or O-linked glycosylation sites, or an addition and/or a deletion of one or more cysteine residues.

Thus, in addition to the transcription factor sequences set forth in the Sequence Listing, the invention also encompasses related nucleic acid molecules that include allelic or splice variants of the transcription factor sequences in the Sequence Listing, and include sequences that are complementary to any of the above nucleotide sequences. Related nucleic acid molecules also include nucleotide sequences encoding a polypeptide comprising or consisting essentially of a substitution, modification, addition and/or deletion of one or more amino acid residues compared to the polypeptide as set forth in any of SEQ ID NO: 2N, wherein N=1-480, SEQ ID NO: 2N-1, where N=857-970, or SEQ ID NO: 989, 990, 991, 1001, 1002, 1012, 1018, 1021, 1022, 1025, 1027, 1028, 1029, 1034, 1050, 1051, 1072, 1073, 1074, 1075, 1076, 1091, 1092, 1093, 1094, 1095, 1109, 1110, 1111, 1112, 1150, 1165, 1166, 1167, 1168, 1169, 1189, 1190, 1191, 1197, 1198, 1199, 1213, 1214, 1215, 1216, 1226, 1227, 1233, 1239, 1246, 1247, 1258, 1259, 1269, 1307, 1308, 1309, 1310, 1323, 1329, 1330, 1331, 1332, 1338, 1339, 1340, 1361, 1362, 1373, 1374, 1375, 1384, 1389, 1390, 1391, 1396, 1411, 1412, 1413, 1414, 1424, 1435, 1436, 1437, 1448, 1456, 1457, 1458, 1459, 1460, 1472, 1483, 1484, 1500, 1508, 1510, 1511, 1520, 1538, 1539, 1540, 1541, 1542, 1543, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1582, 1583, 1594, 1611, 1612, 1618, 1619, 1620, 1626, 1627, 1635, 1636, 1640, 1641, 1655, 1656, 1657, 1658, 1672, 1673, 1680, 1682, 1686, 1687, 1688, 1689, 1696, 1702 or 1703. Such related polypeptides may comprise, for example, additions and/or deletions of one or more N-linked or O-linked glycosylation sites, or an addition and/or a deletion of one or more cysteine residues.

For example, Table 1 illustrates that the codons AGC, AGT, TCA, TCC, TCG, and TCT all encode the same amino acid: serine. Accordingly, at each position in the sequence where there is a codon encoding serine, any of the above trinucleotide sequences can be used without altering the encoded polypeptide.

TABLE 1

Amino acid                 Possible Codons
Alanine        Ala   A     GCA GCC GCG GCT
Cysteine       Cys   C     TGC TGT
Aspartic acid  Asp   D     GAC GAT
Glutamic acid  Glu   E     GAA GAG
Phenylalanine  Phe   F     TTC TTT
Glycine        Gly   G     GGA GGC GGG GGT
Histidine      His   H     CAC CAT
Isoleucine     Ile   I     ATA ATC ATT
Lysine         Lys   K     AAA AAG
Leucine        Leu   L     TTA TTG CTA CTC CTG CTT
Methionine     Met   M     ATG
Asparagine     Asn   N     AAC AAT
Proline        Pro   P     CCA CCC CCG CCT
Glutamine      Gln   Q     CAA CAG
Arginine       Arg   R     AGA AGG CGA CGC CGG CGT
Serine         Ser   S     AGC AGT TCA TCC TCG TCT
Threonine      Thr   T     ACA ACC ACG ACT
Valine         Val   V     GTA GTC GTG GTT
Tryptophan     Trp   W     TGG
Tyrosine       Tyr   Y     TAC TAT

Sequence alterations that do not change the amino acid sequence encoded by the polynucleotide are termed “silent” variations. With the exception of the codons ATG and TGG, encoding methionine and tryptophan, respectively, any of the possible codons for the same amino acid can be substituted by a variety of techniques, e.g., site-directed mutagenesis, available in the art. Accordingly, any and all such variations of a sequence selected from the above table are a feature of the invention.
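Whether a nucleotide change is "silent" can be checked by translating both sequences and comparing the encoded polypeptides. The following sketch builds the standard genetic code in DNA notation; the function names and example sequences are hypothetical, and the check assumes in-frame coding sequences whose length is a multiple of three.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def is_silent(original: str, variant: str) -> bool:
    """True if both coding sequences encode the same polypeptide."""
    return translate(original) == translate(variant)


def synonymous_codons(amino_acid: str) -> list[str]:
    """All codons encoding the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


print(synonymous_codons("S"))                # ['AGC', 'AGT', 'TCA', 'TCC', 'TCG', 'TCT']
print(is_silent("ATGAGCTGG", "ATGTCTTGG"))   # True: AGC and TCT both encode serine
```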

In addition to silent variations, other conservative variations that alter one, or a few amino acids in the encoded polypeptide, can be made without altering the function of the polypeptide. These conservative variants are, likewise, a feature of the invention.

For example, substitutions, deletions and insertions introduced into the sequences provided in the Sequence Listing, are also envisioned by the invention. Such sequence modifications can be engineered into a sequence by site-directed mutagenesis (Wu (ed.) Methods Enzymol. (1993) vol. 217, Academic Press) or the other methods noted below. Amino acid substitutions are typically of single residues; insertions usually will be on the order of about from 1 to 10 amino acid residues; and deletions will range about from 1 to 30 residues. In preferred embodiments, deletions or insertions are made in adjacent pairs, e.g., a deletion of two residues or insertion of two residues. Substitutions, deletions, insertions or any combination thereof can be combined to arrive at a sequence. The mutations that are made in the polynucleotide encoding the transcription factor should not place the sequence out of reading frame and should not create complementary regions that could produce secondary mRNA structure. Preferably, the polypeptide encoded by the DNA performs the desired function.
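The requirement that mutations not place the sequence out of reading frame can be screened with a simple length check: the net number of nucleotides inserted or deleted should be a multiple of three. The sketch below is a necessary-condition check only, with a hypothetical function name; it does not detect compensating frameshifts within the sequence or newly created stop codons.

```python
def preserves_reading_frame(original_cds: str, mutated_cds: str) -> bool:
    """True if the net insertion/deletion length is a multiple of three."""
    return (len(mutated_cds) - len(original_cds)) % 3 == 0


# Deleting one whole codon (GCC) keeps the frame; deleting one base does not.
print(preserves_reading_frame("ATGGCCTGA", "ATGTGA"))    # True
print(preserves_reading_frame("ATGGCCTGA", "ATGGCTGA"))  # False
```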

Conservative substitutions are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with Table 2 when it is desired to maintain the activity of the protein. Table 2 shows amino acids which can be substituted for an amino acid in a protein and which are typically regarded as conservative substitutions.

TABLE 2

Residue    Conservative Substitutions
Ala        Ser
Arg        Lys
Asn        Gln; His
Asp        Glu
Gln        Asn
Cys        Ser
Glu        Asp
Gly        Pro
His        Asn; Gln
Ile        Leu; Val
Leu        Ile; Val
Lys        Arg; Gln
Met        Leu; Ile
Phe        Met; Leu; Tyr
Ser        Thr; Gly
Thr        Ser; Val
Trp        Tyr
Tyr        Trp; Phe
Val        Ile; Leu
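Table 2 lends itself to a direct lookup. The sketch below encodes the table as a dictionary and tests whether a proposed replacement is listed for a given residue. The names are hypothetical, and the lookup is treated as one-directional (original residue to listed substitutes), which is an assumption, since the text does not state that Table 2 may be read in reverse.

```python
# Table 2: residue -> substitutions regarded as conservative.
CONSERVATIVE = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Gln": {"Asn"}, "Cys": {"Ser"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"}, "Met": {"Leu", "Ile"}, "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"}, "Thr": {"Ser", "Val"}, "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if Table 2 lists `replacement` as a substitute for `original`."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ala", "Ser"))  # True
print(is_conservative("Ser", "Ala"))  # False: the Ser row lists only Thr and Gly
```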

Similar substitutions are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with Table 3 when it is desired to maintain the activity of the protein. Table 3 shows amino acids which can be substituted for an amino acid in a protein and which are typically regarded as structural and functional substitutions. For example, a residue in column 1 of Table 3 may be substituted with a residue in column 2; in addition, a residue in column 2 of Table 3 may be substituted with the residue of column 1.

TABLE 3

Residue    Similar Substitutions
Ala        Ser; Thr; Gly; Val; Leu; Ile
Arg        Lys; His; Gly
Asn        Gln; His; Gly; Ser; Thr
Asp        Glu; Ser; Thr
Gln        Asn; Ala
Cys        Ser; Gly
Glu        Asp
Gly        Pro; Arg
His        Asn; Gln; Tyr; Phe; Lys; Arg
Ile        Ala; Leu; Val; Gly; Met
Leu        Ala; Ile; Val; Gly; Met
Lys        Arg; His; Gln; Gly; Pro
Met        Leu; Ile; Phe
Phe        Met; Leu; Tyr; Trp; His; Val; Ala
Ser        Thr; Gly; Asp; Ala; Val; Ile; His
Thr        Ser; Val; Ala; Gly
Trp        Tyr; Phe; His
Tyr        Trp; Phe; His
Val        Ala; Ile; Leu; Gly; Thr; Ser; Glu
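Because Table 3 may be read in either direction (column 1 to column 2, or column 2 back to column 1), a lookup against it should be symmetric. The sketch below encodes the table rows and checks both directions; the function and variable names are hypothetical.

```python
# Table 3: residue (column 1) -> similar substitutions (column 2).
SIMILAR = {
    "Ala": "Ser Thr Gly Val Leu Ile", "Arg": "Lys His Gly",
    "Asn": "Gln His Gly Ser Thr", "Asp": "Glu Ser Thr", "Gln": "Asn Ala",
    "Cys": "Ser Gly", "Glu": "Asp", "Gly": "Pro Arg",
    "His": "Asn Gln Tyr Phe Lys Arg", "Ile": "Ala Leu Val Gly Met",
    "Leu": "Ala Ile Val Gly Met", "Lys": "Arg His Gln Gly Pro",
    "Met": "Leu Ile Phe", "Phe": "Met Leu Tyr Trp His Val Ala",
    "Ser": "Thr Gly Asp Ala Val Ile His", "Thr": "Ser Val Ala Gly",
    "Trp": "Tyr Phe His", "Tyr": "Trp Phe His",
    "Val": "Ala Ile Leu Gly Thr Ser Glu",
}
SIMILAR_SETS = {res: set(subs.split()) for res, subs in SIMILAR.items()}


def is_similar(a: str, b: str) -> bool:
    """True if Table 3 pairs a with b in either direction."""
    return (b in SIMILAR_SETS.get(a, set())) or (a in SIMILAR_SETS.get(b, set()))


print(is_similar("Glu", "Val"))  # True: listed only in the Val row, read in reverse
print(is_similar("Trp", "Ala"))  # False
```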

Substitutions that are less conservative than those in Table 2 can be selected by picking residues that differ more significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. The substitutions which in general are expected to produce the greatest changes in protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.

Further Modifying Sequences of the Invention—Mutation/Forced Evolution

In addition to generating silent or conservative substitutions as noted above, the present invention optionally includes methods of modifying the sequences of the Sequence Listing. In the methods, nucleic acid or protein modification methods are used to alter the given sequences to produce new sequences and/or to chemically or enzymatically modify given sequences to change the properties of the nucleic acids or proteins.

Thus, in one embodiment, given nucleic acid sequences are modified, e.g., according to standard mutagenesis or artificial evolution methods, to produce modified sequences. The modified sequences may be created using purified natural polynucleotides isolated from any organism or may be synthesized from purified compositions and chemicals using chemical means well known to those of skill in the art. For example, Ausubel, supra, provides additional details on mutagenesis methods. Artificial forced evolution methods are described, for example, by Stemmer (1994) Nature 370: 389-391, Stemmer (1994) Proc. Natl. Acad. Sci. 91: 10747-10751, and U.S. Pat. Nos. 5,811,238, 5,837,500, and 6,242,568. Methods for engineering synthetic transcription factors and other polypeptides are described, for example, by Zhang et al. (2000) J. Biol. Chem. 275: 33850-33860, Liu et al. (2001) J. Biol. Chem. 276: 11323-11334, and Isalan et al. (2001) Nature Biotechnol. 19: 656-660. Many other mutation and evolution methods are also available and expected to be within the skill of the practitioner.

Similarly, chemical or enzymatic alteration of expressed nucleic acids and polypeptides can be performed by standard methods. For example, a sequence can be modified by addition of lipids, sugars, peptides, organic or inorganic compounds, by the inclusion of modified nucleotides or amino acids, or the like. For example, protein modification techniques are illustrated in Ausubel, supra. Further details on chemical and enzymatic modifications can be found herein. These modification methods can be used to modify any given sequence, or to modify any sequence produced by the various mutation and artificial evolution modification methods noted herein.

Accordingly, the invention provides for modification of any given nucleic acid by mutation, evolution, chemical or enzymatic modification, or other available methods, as well as for the products produced by practicing such methods, e.g., using the sequences herein as a starting substrate for the various modification approaches.

For example, an optimized coding sequence containing codons preferred by a particular prokaryotic or eukaryotic host can be used, e.g., to increase the rate of translation or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, as compared with transcripts produced using a non-optimized sequence. Translation stop codons can also be modified to reflect host preference. For example, preferred stop codons for Saccharomyces cerevisiae and mammals are TAA and TGA, respectively. The preferred stop codon for monocotyledonous plants is TGA, whereas insects and E. coli prefer to use TAA as the stop codon.
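The stop codon preferences listed above can be applied mechanically to a coding sequence. The following is a minimal sketch under stated assumptions: the host labels, the PREFERRED_STOP table layout and the function adapt_stop_codon are hypothetical, and only the host-to-codon pairings themselves come from the text.

```python
# Host -> preferred stop codon, as stated in the paragraph above.
# The host labels used as keys are illustrative names only.
PREFERRED_STOP = {
    "Saccharomyces cerevisiae": "TAA",
    "mammal": "TGA",
    "monocot": "TGA",
    "insect": "TAA",
    "E. coli": "TAA",
}

STOP_CODONS = {"TAA", "TAG", "TGA"}


def adapt_stop_codon(coding_sequence, host):
    """Replace the terminal stop codon of a DNA coding sequence
    with the stop codon preferred by the given host."""
    cds = coding_sequence.upper()
    if len(cds) < 3 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence does not end in a stop codon")
    return cds[:-3] + PREFERRED_STOP[host]
```

For instance, adapt_stop_codon("ATGGCTTAG", "monocot") changes only the final codon and leaves the encoded amino acids untouched.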

The polynucleotide sequences of the present invention can also beengineered in order to alter a coding sequence for a variety of reasons,including but not limited to, alterations which modify the sequence tofacilitate cloning, processing and/or expression of the gene product.For example, alterations are optionally introduced using techniqueswhich are well known in the art, e.g., site-directed mutagenesis, toinsert new restriction sites, to alter glycosylation patterns, to changecodon preference, to introduce splice sites, etc.

Furthermore, a fragment or domain derived from any of the polypeptidesof the invention can be combined with domains derived from othertranscription factors or synthetic domains to modify the biologicalactivity of a transcription factor. For instance, a DNA-binding domainderived from a transcription factor of the invention can be combinedwith the activation domain of another transcription factor or with asynthetic activation domain. A transcription activation domain assistsin initiating transcription from a DNA-binding site. Examples includethe transcription activation region of VP16 or GAL4 (Moore et al. (1998)Proc. Natl. Acad. Sci. 95: 376-381; Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from bacterial sequences (Ma and Ptashne(1987) Cell 51: 113-119) and synthetic peptides (Giniger and Ptashne(1987) Nature 330: 670-672).

Expression and Modification of Polypeptides

Typically, polynucleotide sequences of the invention are incorporatedinto recombinant DNA (or RNA) molecules that direct expression ofpolypeptides of the invention in appropriate host cells, transgenicplants, in vitro translation systems, or the like. Due to the inherentdegeneracy of the genetic code, nucleic acid sequences which encodesubstantially the same or a functionally equivalent amino acid sequencecan be substituted for any listed sequence to provide for cloning andexpressing the relevant homolog.

The transgenic plants of the present invention comprising recombinantpolynucleotide sequences are generally derived from parental plants,which may themselves be non-transformed (or non-transgenic) plants.These transgenic plants may either have a transcription factor gene“knocked out” (for example, with a genomic insertion by homologousrecombination, an antisense or ribozyme construct) or expressed to anormal or wild-type extent. However, overexpressing transgenic “progeny”plants will exhibit greater mRNA levels, wherein the mRNA encodes atranscription factor, that is, a DNA-binding protein that is capable ofbinding to a DNA regulatory sequence and inducing transcription, andpreferably, expression of a plant trait gene. Preferably, the mRNAexpression level will be at least three-fold greater than that of theparental plant, or more preferably at least ten-fold greater mRNA levelscompared to said parental plant, and most preferably at least fifty-foldgreater compared to said parental plant.
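The three preference levels for overexpression stated above (at least three-fold, ten-fold and fifty-fold relative to the parental plant) amount to simple ratio thresholds. A small sketch follows; the function name, the tier labels and the assumption that mRNA levels arrive as positive numbers in the same arbitrary units are all illustrative.

```python
def overexpression_tier(transgenic_mrna, parental_mrna):
    """Classify the fold increase in mRNA level of a transgenic plant
    relative to its parental plant, using the thresholds in the text."""
    if parental_mrna <= 0:
        raise ValueError("parental mRNA level must be positive")
    fold = transgenic_mrna / parental_mrna
    if fold >= 50:
        return "most preferred (at least fifty-fold)"
    if fold >= 10:
        return "more preferred (at least ten-fold)"
    if fold >= 3:
        return "preferred (at least three-fold)"
    return "below the three-fold threshold"
```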

Vectors, Promoters, and Expression Systems

The present invention includes recombinant constructs comprising one ormore of the nucleic acid sequences herein. The constructs typicallycomprise a vector, such as a plasmid, a cosmid, a phage, a virus (e.g.,a plant virus), a bacterial artificial chromosome (BAC), a yeastartificial chromosome (YAC), or the like, into which a nucleic acidsequence of the invention has been inserted, in a forward or reverseorientation. In a preferred aspect of this embodiment, the constructfurther comprises regulatory sequences, including, for example, apromoter, operably linked to the sequence. Large numbers of suitablevectors and promoters are known to those of skill in the art, and arecommercially available.

General texts that describe molecular biological techniques usefulherein, including the use and production of vectors, promoters and manyother relevant topics, include Berger, Sambrook, supra and Ausubel,supra. Any of the identified sequences can be incorporated into acassette or vector, e.g., for expression in plants. A number ofexpression vectors suitable for stable transformation of plant cells orfor the establishment of transgenic plants have been described includingthose described in Weissbach and Weissbach (1989) Methods for PlantMolecular Biology, Academic Press, and Gelvin et al. (1990) PlantMolecular Biology Manual, Kluwer Academic Publishers. Specific examplesinclude those derived from a Ti plasmid of Agrobacterium tumefaciens, aswell as those disclosed by Herrera-Estrella et al. (1983) Nature 303:209, Bevan (1984) Nucleic Acids Res. 12: 8711-8721, Klee (1985)Bio/Technology 3: 637-642, for dicotyledonous plants.

Alternatively, non-Ti vectors can be used to transfer the DNA into monocotyledonous plants and cells by using free DNA delivery techniques. Such methods can involve, for example, the use of liposomes, electroporation, microprojectile bombardment, silicon carbide whiskers, and viruses. By using these methods transgenic plants such as wheat, rice (Christou (1991) Bio/Technology 9: 957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618) can be produced. An immature embryo can also be a good target tissue for monocots for direct DNA delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol. 102: 1077-1084; Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemeaux (1994) Plant Physiol. 104: 37-48), and for Agrobacterium-mediated DNA transfer (Ishida et al. (1996) Nature Biotechnol. 14: 745-750).

Typically, plant transformation vectors include one or more cloned plant coding sequences (genomic or cDNA) under the transcriptional control of 5′ and 3′ regulatory sequences and a dominant selectable marker. Such plant transformation vectors typically also contain a promoter (e.g., a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start site, an RNA processing signal (such as intron splice sites), a transcription termination site, and/or a polyadenylation signal.

A potential utility for the transcription factor polynucleotidesdisclosed herein is the isolation of promoter elements from these genesthat can be used to program expression in plants of any genes. Eachtranscription factor gene disclosed herein is expressed in a uniquefashion, as determined by promoter elements located upstream of thestart of translation, and additionally within an intron of thetranscription factor gene or downstream of the termination codon of thegene. As is well known in the art, for a significant portion of genes,the promoter sequences are located entirely in the region directlyupstream of the start of translation. In such cases, typically thepromoter sequences are located within 2.0 kb of the start oftranslation, or within 1.5 kb of the start of translation, frequentlywithin 1.0 kb of the start of translation, and sometimes within 0.5 kbof the start of translation.
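As an illustration of the distances given above, a candidate promoter region can be taken as the sequence immediately upstream of the translation start. The sketch below assumes a plus-strand genomic sequence held as a string and a known 0-based index of the ATG; the names WINDOWS_BP and upstream_region are hypothetical, and a real isolation would also have to consider the intron and downstream elements mentioned in the text.

```python
# Upstream window sizes named in the text, in base pairs.
WINDOWS_BP = (2000, 1500, 1000, 500)


def upstream_region(genomic_sequence, start_codon_index, window_bp):
    """Return up to window_bp bases directly upstream of the start of
    translation (the ATG at start_codon_index, 0-based, plus strand)."""
    codon = genomic_sequence[start_codon_index:start_codon_index + 3]
    if codon.upper() != "ATG":
        raise ValueError("no ATG at the given start codon index")
    begin = max(0, start_codon_index - window_bp)
    return genomic_sequence[begin:start_codon_index]
```

If fewer bases than the requested window are available upstream, the function returns whatever sequence exists rather than failing.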

The promoter sequences can be isolated according to methods known to one skilled in the art.

Examples of constitutive plant promoters which can be useful forexpressing the TF sequence include: the cauliflower mosaic virus (CaMV)35S promoter, which confers constitutive, high-level expression in mostplant tissues (see, e.g., Odell et al. (1985) Nature 313: 810-812); thenopaline synthase promoter (An et al. (1988) Plant Physiol. 88:547-552); and the octopine synthase promoter (Fromm et al. (1989) PlantCell 1: 977-984).

A variety of plant gene promoters that regulate gene expression in response to environmental, hormonal, chemical, developmental signals, and in a tissue-active manner can be used for expression of a TF sequence in plants. Choice of a promoter is based largely on the phenotype of interest and is determined by such factors as tissue (e.g., seed, fruit, root, pollen, vascular tissue, flower, carpel, etc.), inducibility (e.g., in response to wounding, heat, cold, drought, light, pathogens, etc.), timing, developmental stage, and the like. Numerous known promoters have been characterized and can favorably be employed to promote expression of a polynucleotide of the invention in a transgenic plant or cell of interest. For example, tissue-specific promoters include: seed-specific promoters (such as the napin, phaseolin or DC3 promoter described in U.S. Pat. No. 5,773,697), fruit-specific promoters that are active during fruit ripening (such as the dru 1 promoter (U.S. Pat. No. 5,783,393), or the 2A11 promoter (U.S. Pat. No. 4,943,674) and the tomato polygalacturonase promoter (Bird et al. (1988) Plant Mol. Biol. 11: 651-662)), root-specific promoters, such as those disclosed in U.S. Pat. Nos. 5,618,988, 5,837,848 and 5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (U.S. Pat. No. 5,792,929), promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol. Biol. 37: 977-988), flower-specific (Kaiser et al. (1995) Plant Mol. Biol. 28: 231-243), pollen (Baerson et al. (1994) Plant Mol. Biol. 26: 1947-1959), carpels (Ohl et al. (1990) Plant Cell 2: 837-848), pollen and ovules (Baerson et al. (1993) Plant Mol. Biol. 22: 255-267), auxin-inducible promoters (such as that described in van der Kop et al. (1999) Plant Mol. Biol. 39: 979-990 or Baumann et al. (1999) Plant Cell 11: 323-334), cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol. Biol. 38: 743-753), promoters responsive to gibberellin (Shi et al. (1998) Plant Mol. Biol. 38: 1053-1060, Willmott et al. (1998) 38: 817-825) and the like. Additional promoters are those that elicit expression in response to heat (Ainley et al. (1993) Plant Mol. Biol. 22: 13-23), light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al. (1989) Plant Cell 1: 471-478, and the maize rbcS promoter, Schaffner and Sheen (1991) Plant Cell 3: 997-1012); wounding (e.g., wunI, Siebertz et al. (1989) Plant Cell 1: 961-968); pathogens (such as the PR-1 promoter described in Buchel et al. (1999) Plant Mol. Biol. 40: 387-396, and the PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol. 38: 1071-1080), and chemicals such as methyl jasmonate or salicylic acid (Gatz (1997) Annu. Rev. Plant Physiol. Plant Mol. Biol. 48: 89-108). In addition, the timing of the expression can be controlled by using promoters such as those acting at senescence (Gan and Amasino (1995) Science 270: 1986-1988); or late seed development (Odell et al. (1994) Plant Physiol. 106: 447-458).

Plant expression vectors can also include RNA processing signals that can be positioned within, upstream or downstream of the coding sequence. In addition, the expression vectors can include additional regulatory sequences from the 3′-untranslated region of plant genes, e.g., a 3′ terminator region to increase stability of the mRNA, such as the PI-II terminator region of potato or the octopine or nopaline synthase 3′ terminator regions.

Additional Expression Elements

Specific initiation signals can aid in efficient translation of coding sequences. These signals can include, e.g., the ATG initiation codon and adjacent sequences. In cases where a coding sequence, its initiation codon and upstream sequences are inserted into the appropriate expression vector, no additional translational control signals may be needed. However, in cases where only coding sequence (e.g., a mature protein coding sequence), or a portion thereof, is inserted, exogenous translational control signals including the ATG initiation codon can be separately provided. The initiation codon is provided in the correct reading frame to facilitate translation. Exogenous translational elements and initiation codons can be of various origins, both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of enhancers appropriate to the cell system in use.

Expression Hosts

The present invention also relates to host cells which are transducedwith vectors of the invention, and the production of polypeptides of theinvention (including fragments thereof) by recombinant techniques. Hostcells are genetically engineered (i.e., nucleic acids are introduced,e.g., transduced, transformed or transfected) with the vectors of thisinvention, which may be, for example, a cloning vector or an expressionvector comprising the relevant nucleic acids herein. The vector isoptionally a plasmid, a viral particle, a phage, a naked nucleic acid,etc. The engineered host cells can be cultured in conventional nutrientmedia modified as appropriate for activating promoters, selectingtransformants, or amplifying the relevant gene. The culture conditions,such as temperature, pH and the like, are those previously used with thehost cell selected for expression, and will be apparent to those skilledin the art and in the references cited herein, including, Sambrook,supra and Ausubel, supra.

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are also suitable for some applications. For example, the DNA fragments are introduced into plant tissues, cultured plant cells or plant protoplasts by standard methods including electroporation (Fromm et al. (1985) Proc. Natl. Acad. Sci. 82: 5824-5828), infection by viral vectors such as cauliflower mosaic virus (CaMV) (Hohn et al. (1982) Molecular Biology of Plant Tumors, Academic Press, New York, N.Y., pp. 549-560; U.S. Pat. No. 4,407,956), high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface (Klein et al. (1987) Nature 327: 70-73), use of pollen as vector (WO 85/01856), or use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA plasmid in which DNA fragments are cloned. The T-DNA plasmid is transmitted to plant cells upon infection by Agrobacterium tumefaciens, and a portion is stably integrated into the plant genome (Horsch et al. (1984) Science 233: 496-498; Fraley et al. (1983) Proc. Natl. Acad. Sci. 80: 4803-4807).

The cell can include a nucleic acid of the invention that encodes apolypeptide, wherein the cell expresses a polypeptide of the invention.The cell can also include vector sequences, or the like. Furthermore,cells and transgenic plants that include any polypeptide or nucleic acidabove or throughout this specification, e.g., produced by transductionof a vector of the invention, are an additional feature of theinvention.

For long-term, high-yield production of recombinant proteins, stableexpression can be used. Host cells transformed with a nucleotidesequence encoding a polypeptide of the invention are optionally culturedunder conditions suitable for the expression and recovery of the encodedprotein from cell culture. The protein or fragment thereof produced by arecombinant cell may be secreted, membrane-bound, or containedintracellularly, depending on the sequence and/or the vector used. Aswill be understood by those of skill in the art, expression vectorscontaining polynucleotides encoding mature proteins of the invention canbe designed with signal sequences which direct secretion of the maturepolypeptides through a prokaryotic or eukaryotic cell membrane.

Modified Amino Acid Residues

Polypeptides of the invention may contain one or more modified aminoacid residues. The presence of modified amino acids may be advantageousin, for example, increasing polypeptide half-life, reducing polypeptideantigenicity or toxicity, increasing polypeptide storage stability, orthe like. Amino acid residue(s) are modified, for example,co-translationally or post-translationally during recombinant productionor modified by synthetic or chemical means.

Non-limiting examples of a modified amino acid residue includeincorporation or other use of acetylated amino acids, glycosylated aminoacids, sulfated amino acids, prenylated (e.g., farnesylated,geranylgeranylated) amino acids, PEG modified (e.g., “PEGylated”) aminoacids, biotinylated amino acids, carboxylated amino acids,phosphorylated amino acids, etc. References adequate to guide one ofskill in the modification of amino acid residues are replete throughoutthe literature.

The modified amino acid residues may prevent or increase affinity of the polypeptide for another molecule, including, but not limited to, polynucleotides, proteins, carbohydrates, lipids and lipid derivatives, and other organic or synthetic compounds.

Identification of Additional Factors

A transcription factor provided by the present invention can also be used to identify additional endogenous or exogenous molecules that can affect a phenotype or trait of interest. On the one hand, such molecules include organic (small or large molecules) and/or inorganic compounds that affect expression of (i.e., regulate) a particular transcription factor. Alternatively, such molecules include endogenous molecules that are acted upon at a transcriptional level by a transcription factor of the invention to modify a phenotype as desired. For example, the transcription factors can be employed to identify one or more downstream genes that are subject to a regulatory effect of the transcription factor. In one approach, a transcription factor or transcription factor homolog of the invention is expressed in a host cell, e.g., a transgenic plant cell, tissue or explant, and expression products, either RNA or protein, of likely or random targets are monitored, e.g., by hybridization to a microarray of nucleic acid probes corresponding to genes expressed in a tissue or cell type of interest, by two-dimensional gel electrophoresis of protein products, or by any other method known in the art for assessing expression of gene products at the level of RNA or protein. Alternatively, a transcription factor of the invention can be used to identify promoter sequences (such as binding sites on DNA sequences) involved in the regulation of a downstream target. After identifying a promoter sequence, interactions between the transcription factor and the promoter sequence can be modified by changing specific nucleotides in the promoter sequence or specific amino acids in the transcription factor that interact with the promoter sequence to alter a plant trait. Typically, transcription factor DNA-binding sites are identified by gel shift assays. After identifying the promoter regions, the promoter region sequences can be employed in double-stranded DNA arrays to identify molecules that affect the interactions of the transcription factors with their promoters (Bulyk et al. (1999) Nature Biotechnol. 17: 573-577).

The identified transcription factors are also useful to identify proteins that modify the activity of the transcription factor. Such modification can occur by covalent modification, such as by phosphorylation, or by protein-protein (homo- or heteropolymer) interactions. Any method suitable for detecting protein-protein interactions can be employed. Among the methods that can be employed are co-immunoprecipitation, cross-linking and co-purification through gradients or chromatographic columns, and the two-hybrid yeast system.

The two-hybrid system detects protein interactions in vivo and is described in Chien et al. (1991) Proc. Natl. Acad. Sci. 88: 9578-9582, and is commercially available from Clontech (Palo Alto, Calif.). In such a system, plasmids are constructed that encode two hybrid proteins: one consists of the DNA-binding domain of a transcription activator protein fused to the TF polypeptide and the other consists of the transcription activator protein's activation domain fused to an unknown protein that is encoded by a cDNA that has been recombined into the plasmid as part of a cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are transformed into a strain of the yeast Saccharomyces cerevisiae that contains a reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's binding site. Neither hybrid protein alone can activate transcription of the reporter gene. Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in expression of the reporter gene, which is detected by an assay for the reporter gene product. Then, the library plasmids responsible for reporter gene expression are isolated and sequenced to identify the proteins encoded by the library plasmids. After identifying proteins that interact with the transcription factors, assays for compounds that interfere with the TF protein-protein interactions can be performed.

Identification of Modulators

In addition to the intracellular molecules described above, extracellular molecules that alter activity or expression of a transcription factor, either directly or indirectly, can be identified. For example, the methods can entail first placing a candidate molecule in contact with a plant or plant cell. The molecule can be introduced by topical administration, such as spraying or soaking of a plant, or incubating a plant in a solution containing the molecule, and then the molecule's effect on the expression or activity of the TF polypeptide or the expression of the polynucleotide is monitored. Changes in the expression of the TF polypeptide can be monitored by use of polyclonal or monoclonal antibodies, gel electrophoresis or the like. Changes in the expression of the corresponding polynucleotide sequence can be detected by use of microarrays, Northerns, quantitative PCR, or any other technique for monitoring changes in mRNA expression. These techniques are exemplified in Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & Sons (1998, and supplements through 2001). Changes in the activity of the transcription factor can be monitored, directly or indirectly, by assaying the function of the transcription factor, for example, by measuring the expression of promoters known to be controlled by the transcription factor (using promoter-reporter constructs), measuring the levels of transcripts using microarrays, Northern blots, quantitative PCR, etc. Such changes in the expression levels can be correlated with modified plant traits and thus identified molecules can be useful for soaking or spraying on fruit, vegetable and grain crops to modify traits in plants.

Essentially any available composition can be tested for modulatoryactivity of expression or activity of any nucleic acid or polypeptideherein. Thus, available libraries of compounds such as chemicals,polypeptides, nucleic acids and the like can be tested for modulatoryactivity. Often, potential modulator compounds can be dissolved inaqueous or organic (e.g., DMSO-based) solutions for easy delivery to thecell or plant of interest in which the activity of the modulator is tobe tested. Optionally, the assays are designed to screen large modulatorcomposition libraries by automating the assay steps and providingcompounds from any convenient source to assays, which are typically runin parallel (e.g., in microtiter formats on microplates in roboticassays).

In one embodiment, high throughput screening methods involve providing acombinatorial library containing a large number of potential compounds(potential modulator compounds). Such “combinatorial chemical libraries”are then screened in one or more assays, as described herein, toidentify those library members (particular chemical species orsubclasses) that display a desired characteristic activity. Thecompounds thus identified can serve as target compounds.

A combinatorial chemical library can be, e.g., a collection of diversechemical compounds generated by chemical synthesis or biologicalsynthesis. For example, a combinatorial chemical library such as apolypeptide library is formed by combining a set of chemical buildingblocks (e.g., in one example, amino acids) in every possible way for agiven compound length (i.e., the number of amino acids in a polypeptidecompound of a set length). Exemplary libraries include peptidelibraries, nucleic acid libraries, antibody libraries (see, e.g., Vaughnet al. (1996) Nature Biotechnol. 14: 309-314 and PCT/US96/10287),carbohydrate libraries (see, e.g., Liang et al. Science (1996) 274:1520-1522 and U.S. Pat. No. 5,593,853), peptide nucleic acid libraries(see, e.g., U.S. Pat. No. 5,539,083), and small organic moleculelibraries (see, e.g., benzodiazepines, in Baum Chem. & Engineering NewsJan. 18, 1993, page 33; isoprenoids, U.S. Pat. No. 5,569,588;thiazolidinones and metathiazanones, U.S. Pat. No. 5,549,974;pyrrolidines, U.S. Pat. Nos. 5,525,735 and 5,519,134; morpholinocompounds, U.S. Pat. No. 5,506,337) and the like.
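The phrase "in every possible way for a given compound length" corresponds to a Cartesian product over the set of building blocks. The sketch below enumerates such a peptide library in silico; the one-letter alphabet of the 20 standard amino acids and the function names are assumptions made for illustration, not something the specification prescribes.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids (illustrative building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def peptide_library(length, building_blocks=AMINO_ACIDS):
    """Lazily yield every peptide of the given length that can be formed
    from the building blocks, i.e. every possible combination."""
    for combination in product(building_blocks, repeat=length):
        yield "".join(combination)


def library_size(length, n_building_blocks=len(AMINO_ACIDS)):
    """Number of distinct members in a complete library of this length."""
    return n_building_blocks ** length
```

The library grows exponentially with length (20 raised to the power of the length for the standard amino acids), which is why the generator yields members lazily instead of building the full list in memory.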

Preparation and screening of combinatorial or other libraries is wellknown to those of skill in the art. Such combinatorial chemicallibraries include, but are not limited to, peptide libraries (see, e.g.,U.S. Pat. No. 5,010,175; Furka, (1991) Int. J. Pept. Prot. Res. 37:487-493; and Houghton et al. (1991) Nature 354: 84-88). Otherchemistries for generating chemical diversity libraries can also beused.

In addition, as noted, compound screening equipment for high-throughputscreening is generally available, e.g., using any of a number of wellknown robotic systems that have also been developed for solution phasechemistries useful in assay systems. These systems include automatedworkstations including an automated synthesis apparatus and roboticsystems utilizing robotic arms. Any of the above devices are suitablefor use with the present invention, e.g., for high-throughput screeningof potential modulators. The nature and implementation of modificationsto these devices (if any) so that they can operate as discussed hereinwill be apparent to persons skilled in the relevant art.

Indeed, entire high-throughput screening systems are commerciallyavailable. These systems typically automate entire procedures includingall sample and reagent pipetting, liquid dispensing, timed incubations,and final readings of the microplate in detector(s) appropriate for theassay. These configurable systems provide high throughput and rapidstart up as well as a high degree of flexibility and customization.Similarly, microfluidic implementations of screening are alsocommercially available.

The manufacturers of such systems provide detailed protocols for the various high-throughput systems. Thus, for example, Zymark Corp provides technical bulletins describing screening systems for detecting the modulation of gene transcription, ligand binding, and the like. The integrated systems herein, in addition to providing for sequence alignment and, optionally, synthesis of relevant nucleic acids, can include such screening apparatus to identify modulators that have an effect on one or more polynucleotides or polypeptides according to the present invention.

In some assays it is desirable to have positive controls to ensure that the components of the assays are working properly. At least two types of positive controls are appropriate. That is, known transcriptional activators or inhibitors can be incubated with cells or plants, for example, in one sample of the assay, and the resulting increase/decrease in transcription can be detected by measuring the resulting change in RNA levels and/or protein expression, for example, according to the methods herein. It will be appreciated that modulators can also be combined with transcriptional activators or inhibitors to find modulators that inhibit transcriptional activation or transcriptional repression. Either expression of the nucleic acids and proteins herein or any additional nucleic acids or proteins activated by the nucleic acids or proteins herein, or both, can be monitored.

In an embodiment, the invention provides a method for identifyingcompositions that modulate the activity or expression of apolynucleotide or polypeptide of the invention. For example, a testcompound, whether a small or large molecule, is placed in contact with acell, plant (or plant tissue or explant), or composition comprising thepolynucleotide or polypeptide of interest and a resulting effect on thecell, plant, (or tissue or explant) or composition is evaluated bymonitoring, either directly or indirectly, one or more of: expressionlevel of the polynucleotide or polypeptide, activity (or modulation ofthe activity) of the polynucleotide or polypeptide. In some cases, analteration in a plant phenotype can be detected following contact of aplant (or plant cell, or tissue or explant) with the putative modulator,e.g., by modulation of expression or activity of a polynucleotide orpolypeptide of the invention. Modulation of expression or activity of apolynucleotide or polypeptide of the invention may also be caused bymolecular elements in a signal transduction second messenger pathway andsuch modulation can affect similar elements in the same or anothersignal transduction second messenger pathway.

Subsequences

Also contemplated are uses of polynucleotides, also referred to herein as oligonucleotides, typically having at least 12 bases, preferably at least 15, more preferably at least 20, 30, or 50 bases, which hybridize under at least highly stringent (or ultra-high stringent or ultra-ultra-high stringent) conditions to a polynucleotide sequence described above. The polynucleotides may be used as probes, primers, sense and antisense agents, and the like, according to methods as noted supra.

Subsequences of the polynucleotides of the invention, including polynucleotide fragments and oligonucleotides, are useful as nucleic acid probes and primers. An oligonucleotide suitable for use as a probe or primer is at least about 15 nucleotides in length, more often at least about 18 nucleotides, often at least about 21 nucleotides, frequently at least about 30 nucleotides, or about 40 nucleotides, or more in length. A nucleic acid probe is useful in hybridization protocols, e.g., to identify additional polypeptide homologs of the invention, including protocols for microarray experiments. Primers can be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid amplification methods. See Sambrook, supra, and Ausubel, supra.
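The length and annealing criteria for probes and primers can be illustrated with a minimal Python sketch. It assumes DNA written 5' to 3' using only A, C, G, and T, treats the "at least about 15 nucleotides" figure as a hard cutoff, and models annealing as perfect complementarity with no mismatches. The function names and the example target sequence are hypothetical and are not taken from the disclosure.

```python
# Illustrative sketch only; names, cutoff handling, and sequences are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
MIN_OLIGO_LENGTH = 15  # "at least about 15 nucleotides", treated here as a hard cutoff


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_suitable_length(oligo, minimum=MIN_OLIGO_LENGTH):
    """Check the length criterion for a probe or primer."""
    return len(oligo) >= minimum


def annealing_sites(primer, target):
    """Return 0-based start positions on `target` where `primer` is perfectly complementary."""
    site = reverse_complement(primer)
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Hypothetical 40-nt target strand and a 20-nt primer complementary to bases 5..24.
target = "ATGGCTAGCAAGGAGATATACCATGGGCAGCAGCCATCAT"
primer = reverse_complement(target[5:25])
```

Real hybridization depends on stringency conditions and tolerates some mismatches, so exact string matching is only a first approximation of where a primer would anneal.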

In addition, the invention includes an isolated or recombinant polypeptide including a subsequence of at least about 15 contiguous amino acids encoded by the recombinant or isolated polynucleotides of the invention. For example, such polypeptides, or domains or fragments thereof, can be used as immunogens, e.g., to produce antibodies specific for the polypeptide sequence, or as probes for detecting a sequence of interest. A subsequence can range in size from about 15 amino acids in length up to and including the full length of the polypeptide.
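The size range for polypeptide subsequences can be made concrete with a short Python sketch that enumerates every contiguous fragment from 15 residues up to the full length. The 15-residue minimum is treated as a hard cutoff, and the function name and example sequence are hypothetical.

```python
# Illustrative sketch only; the 15-residue minimum is treated as a hard cutoff.
MIN_SUBSEQUENCE_LENGTH = 15


def contiguous_subsequences(polypeptide, minimum=MIN_SUBSEQUENCE_LENGTH):
    """Yield (start, fragment) for every contiguous fragment of at least `minimum` residues."""
    n = len(polypeptide)
    for length in range(minimum, n + 1):
        for start in range(n - length + 1):
            yield start, polypeptide[start:start + length]


# Hypothetical 17-residue sequence in one-letter amino acid code.
example = "MKTAYIAKQRQISFVKS"
fragments = list(contiguous_subsequences(example))
```

The enumeration grows quadratically with polypeptide length, so for full-length proteins it is usually more practical to generate only fragments of a chosen length.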

To be encompassed by the present invention, an expressed polypeptide which comprises such a polypeptide subsequence performs at least one biological function of the intact polypeptide in substantially the same manner, or to a similar extent, as does the intact polypeptide. For example, a polypeptide fragment can comprise a recognizable structural motif or functional domain such as a DNA binding domain that activates transcription, e.g., by binding to a specific DNA promoter region, an activation domain, or a domain for protein-protein interactions.

Production of Transgenic Plants

Modification of Traits

The polynucleotides of the invention are favorably employed to produce transgenic plants with various traits, or characteristics, that have been modified in a desirable manner, e.g., to improve the seed characteristics of a plant. For example, alteration of expression levels or patterns (e.g., spatial or temporal expression patterns) of one or more of the transcription factors (or transcription factor homologs) of the invention, as compared with the levels of the same protein found in a wild-type plant, can be used to modify a plant's traits. An illustrative example of trait modification, improved characteristics, by altering expression levels of a particular transcription factor is described further in the Examples and the Sequence Listing.

Arabidopsis as a Model System

Arabidopsis thaliana is the object of rapidly growing attention as a model for genetics and metabolism in plants. Arabidopsis has a small genome, and well-documented studies are available. It is easy to grow in large numbers and mutants defining important genetically controlled mechanisms are either available, or can readily be obtained. Various methods to introduce and express isolated homologous genes are available (see Koncz et al., eds., Methods in Arabidopsis Research (1992) World Scientific, New Jersey, N.J., in “Preface”). Because of its small size, short life cycle, obligate autogamy and high fertility, Arabidopsis is also a choice organism for the isolation of mutants and studies in morphogenetic and developmental pathways, and control of these pathways by transcription factors (Koncz supra, p. 72). A number of studies introducing transcription factors into A. thaliana have demonstrated the utility of this plant for understanding the mechanisms of gene regulation and trait alteration in plants. (See, for example, Koncz supra, and U.S. Pat. No. 6,417,428).

Arabidopsis Genes in Transgenic Plants.

Expression of genes that encode transcription factors that modify expression of endogenous genes, polynucleotides, and proteins is well known in the art. In addition, transgenic plants comprising isolated polynucleotides encoding transcription factors may also modify expression of endogenous genes, polynucleotides, and proteins. Examples include Peng et al. (1997) Genes and Development 11: 3194-3205, and Peng et al. (1999) Nature 400: 256-261. In addition, many others have demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant species elicits the same or very similar phenotypic response. See, for example, Fu et al. (2001) Plant Cell 13: 1791-1802; Nandi et al. (2000) Curr. Biol. 10: 215-218; Coupland (1995) Nature 377: 482-483; and Weigel and Nilsson (1995) Nature 377: 482-500.

Homologous Genes Introduced Into Transgenic Plants.

Homologous genes that may be derived from any plant, or from any source whether natural, synthetic, semi-synthetic or recombinant, and that share significant sequence identity or similarity to those provided by the present invention, may be introduced into plants, for example, crop plants, to confer desirable or improved traits. Consequently, transgenic plants may be produced that comprise a recombinant expression vector or cassette with a promoter operably linked to one or more sequences homologous to presently disclosed sequences. The promoter may be, for example, a plant or viral promoter.

The invention thus provides for methods for preparing transgenic plants, and for modifying plant traits. These methods include introducing into a plant a recombinant expression vector or cassette comprising a functional promoter operably linked to one or more sequences homologous to presently disclosed sequences. Plants and kits for producing these plants that result from the application of these methods are also encompassed by the present invention.

Transcription Factors of Interest for the Modification of Plant Traits

Currently, the existence of a series of maturity groups for different latitudes represents a major barrier to the introduction of new valuable traits. Any trait (e.g., disease resistance) has to be bred into each of the different maturity groups separately, a laborious and costly exercise. The availability of a single strain, which could be grown at any latitude, would therefore greatly increase the potential for introducing new traits to crop species such as soybean and cotton.

For many of the specific effects, traits and utilities listed in Table 4 and Table 6 that may be conferred to plants, one or more transcription factor genes may be used to increase or decrease, advance or delay, or improve or prove deleterious to a given trait. Overexpressing or suppressing one or more genes can impart significant differences in production of plant products, such as different fatty acid ratios.

For example, overexpression of G720 caused a plant to become more freezing tolerant, but knocking out the same transcription factor imparted greater susceptibility to freezing. Thus, suppressing a gene that causes a plant to be more sensitive to cold may improve a plant's tolerance of cold. More than one transcription factor gene may be introduced into a plant, either by transforming the plant with one or more vectors comprising two or more transcription factors, or by selective breeding of plants to yield hybrid crosses that comprise more than one introduced transcription factor.

A listing of specific effects and utilities that the presently disclosed transcription factor genes have on plants, as determined by direct observation and assay analysis, is provided in Table 4. Table 4 shows the polynucleotides identified by SEQ ID NO; Mendel Gene ID No. (GID); and if the polynucleotide was tested in a transgenic assay. The first column shows the polynucleotide SEQ ID NO; the second column shows the GID; the third column shows whether the gene was overexpressed (OE) or knocked out (KO) in plant studies; the fourth column shows the trait(s) resulting from the knock out or overexpression of the polynucleotide in the transgenic plant; the fifth column shows the category of the trait; and the sixth column (“Comment”), includes specific observations made with respect to the polynucleotide of the first column.

TABLE 4 Traits, trait categories, and effects and utilities that transcription factor genes have on plants.

SEQ ID NO: | GID No. | OE/KO | Trait(s) | Category | Observations
1 | G3 | OE | Size | Dev and morph | Small plant
1 | G3 | OE | Heat | Abiotic stress | More sensitive to heat in a growth assay
5 | G5 | OE | Size | Dev and morph | Small plant
11 | G8 | OE | Flowering time | Flowering time | Late flowering
13 | G9 | OE | Root | Dev and morph | Increased root mass
21 | G19 | OE | Erysiphe | Disease | Increased tolerance to Erysiphe
21 | G19 |  | Hormone sensitivity | Hormone sensitivity | Repressed by methyl jasmonate and induced by ACC
23 | G20 | OE | Seed sterols | Seed biochemistry | Increase in campesterol
25 | G21 | OE | Size | Dev and morph | Reduced size
27 | G22 | OE | Sodium chloride | Abiotic stress | Increased tolerance to high salt
29 | G24 | OE | Morphology: other | Dev and morph | Reduced size
29 | G24 | OE | Necrosis | Dev and morph | Necrotic patches
31 | G25 | OE | Trichome | Dev and morph | Fewer trichomes at seedling stage
31 | G25 | OE | Fusarium | Disease | Expression induced by Fusarium infection
33 | G26 | OE | Sugar sensing | Sugar sensing | Decreased germination and growth on glucose medium
35 | G27 | OE | Morphology: other | Dev and morph | Abnormal development, small
37 | G28 | OE | Botrytis | Disease | Increased tolerance to Botrytis
37 | G28 | OE | Erysiphe | Disease | Increased resistance to Erysiphe
37 | G28 | OE | Sclerotinia | Disease | Increased tolerance to Sclerotinia
39 | G32 | OE | Leaf | Dev and morph | Curled leaves
41 | G38 | OE | Sugar sensing | Sugar sensing | Reduced germination on glucose medium
49 | G43 | OE | Sugar sensing | Sugar sensing | Decreased germination and growth on glucose medium
53 | G46 | OE | Size | Dev and morph | Increased size
53 | G46 | OE | Drought | Abiotic stress | Increased tolerance to drought
55 | G134 | OE | Flower | Dev and morph | Homeotic transformation
55 | G134 | OE | Cold | Abiotic stress | Increased sensitivity to cold
57 | G142 | OE | Flowering time | Flowering time | Early flowering
59 | G145 | OE | Flowering time | Flowering time | Early flowering
59 | G145 | OE | Inflorescence | Dev and morph | Terminal flowers
61 | G146 | OE | Abiotic stress | Abiotic stress | Better growth in low nitrogen
61 | G146 | OE | Nutrient uptake | Abiotic stress | Altered C:N sensing: reduced anthocyanin production on high sucrose/low nitrogen
61 | G146 | OE | Flowering time | Flowering time | Early flowering
65 | G153 | OE | Abiotic stress | Abiotic stress | Tolerant to low nitrogen conditions
67 | G156 | KO | Seed | Dev and morph | Seed color alteration
69 | G157 | OE | Flowering time | Flowering time | Modest overexpression triggers early flowering; greater overexpression delays flowering
71 | G162 | OE | Seed oil content | Seed biochemistry | Increased seed oil content
71 | G162 | OE | Seed protein content | Seed biochemistry | Altered seed protein content
77 | G171 | OE | Cold | Abiotic stress | Expression induced by cold and heat
77 | G171 | OE | Heat | Abiotic stress |
87 | G180 | OE | Seed oil content | Seed biochemistry | Decreased seed oil
87 | G180 | OE | Flowering time | Flowering time | Early flowering
89 | G184 | OE | Flowering time | Flowering time | Early flowering
89 | G184 | OE | Size | Dev and morph | Small plant
91 | G185 | OE | Leaf glucosinolates | Leaf biochemistry | Increased M39481
91 | G185 | OE | Flowering time | Flowering time | Early flowering
95 | G187 | OE | Morphology: other | Dev and morph | Variety of morphological alterations
97 | G188 | KO | Osmotic | Abiotic stress | Better germination under osmotic stress
97 | G188 | KO | Fusarium | Disease | Increased susceptibility to Fusarium
101 | G192 | OE | Seed oil content | Seed biochemistry | Decreased oil content
101 | G192 | OE | Flowering time | Flowering time | Late flowering
103 | G194 | OE | Size | Dev and morph | Small plant
105 | G196 | OE | Sodium chloride | Abiotic stress | Increased tolerance to high salt
109 | G198 | OE | Flowering time | Flowering time | Late flowering
111 | G201 | OE | Seed protein content | Seed biochemistry | Increased seed protein content
111 | G201 | OE | Seed oil content | Seed biochemistry | Decreased seed oil content
115 | G206 | OE | Seed | Dev and morph | Large seeds
117 | G207 | OE | Sugar sensing | Sugar sensing | Decreased germination on glucose medium
117 | G207 | KO | Botrytis | Disease | Increased susceptibility to Botrytis
119 | G208 | OE | Flowering time | Flowering time | Early flowering
121 | G211 | OE | Leaf insoluble sugars | Leaf biochemistry | Increase in xylose
123 | G212 | OE | Trichome | Dev and morph | Partially to fully glabrous on adaxial surface of leaves
127 | G214 | OE | Leaf fatty acids | Leaf biochemistry | Increased leaf fatty acids
127 | G214 | OE | Leaf prenyl lipids | Leaf biochemistry | Increased leaf chlorophyll and carotenoids
127 | G214 | OE | Flowering time | Flowering time | Late flowering
127 | G214 | OE | Seed prenyl lipids | Seed biochemistry | Increased seed lutein
135 | G222 | OE | Seed oil content | Seed biochemistry | Decreased seed oil content
135 | G222 | OE | Seed protein content | Seed biochemistry | Increased seed protein content
137 | G224 | OE | Cold | Abiotic stress | Increased tolerance to cold
137 | G224 | OE | Leaf | Dev and morph | Altered leaf shape
137 | G224 | OE | Sugar sensing | Sugar sensing | Increased germination and seedling vigor on high glucose
139 | G225 | OE | Root | Dev and morph | Increased root hairs
139 | G225 | OE | Trichome | Dev and morph | Glabrous, lack of trichomes
139 | G225 | OE | Nutrient uptake | Abiotic stress | Increased tolerance to nitrogen-limited medium
141 | G226 | OE | Nutrient uptake | Abiotic stress | Increased tolerance to nitrogen-limited medium
141 | G226 | OE | Seed protein content | Seed biochemistry | Increased seed protein
141 | G226 | OE | Root | Dev and morph | Increased root hairs
141 | G226 | OE | Trichome | Dev and morph | Glabrous, lack of trichomes
141 | G226 | OE | Sodium chloride | Abiotic stress | Increased tolerance to high salt
143 | G227 | OE | Flowering time | Flowering time | Early flowering
147 | G229 | OE | Seed protein content | Seed biochemistry | Decreased seed protein
147 | G229 | OE | Seed oil content | Seed biochemistry | Increased seed oil
147 | G229 | OE | Biochemistry: other | Biochem: misc | Up-regulation of genes involved in secondary metabolism
151 | G231 | OE | Leaf fatty acids | Leaf biochemistry | Increased leaf unsaturated fatty acids
151 | G231 | OE | Seed protein content | Seed biochemistry | Decreased seed protein content
151 | G231 | OE | Seed oil content | Seed biochemistry | Increased seed oil content
157 | G234 | OE | Flowering time | Flowering time | Late flowering, small plant
159 | G237 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased leaf insoluble sugars
159 | G237 | OE | Erysiphe | Disease | Increased tolerance to Erysiphe
161 | G239 | OE | ABA | Abiotic stress | Expression induced by ABA
161 | G239 | OE | Drought | Abiotic stress | Expression induced by drought
161 | G239 | OE | Heat | Abiotic stress | Expression induced by heat
161 | G239 | OE | Osmotic stress | Abiotic stress | Expression induced by osmotic stress
163 | G241 | OE | Seed oil content | Seed biochemistry | Decreased seed oil
163 | G241 | KO | Seed protein content | Seed biochemistry | Altered seed protein content
163 | G241 | OE | Sugar sensing | Sugar sensing | Decreased germination and growth on glucose medium
165 | G242 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased arabinose
169 | G247 | OE | Trichome | Dev and morph | Altered trichome distribution
171 | G248 | OE | Botrytis | Disease | Increased susceptibility to Botrytis
173 | G249 | OE | Flowering time | Flowering time | Late flowering
173 | G249 | OE | Senescence | Dev and morph | Delayed senescence
179 | G254 | OE | Sugar sensing | Sugar sensing | Decreased germination and growth on glucose medium
181 | G255 | OE | Flowering time | Flowering time | Early flowering
183 | G256 | OE | Cold | Abiotic stress | Better germination and growth in cold
187 | G258 | OE | Size | Dev and morph | Reduced size
191 | G261 | OE | Botrytis | Disease | Increased susceptibility to Botrytis
193 | G263 | OE | Sugar sensing | Sugar sensing | Decreased root growth on sucrose medium, root specific expression
199 | G274 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased leaf arabinose
203 | G280 | OE | Size | Dev and morph | Reduced size
203 | G280 | OE | Leaf prenyl lipids | Leaf biochemistry | Increased delta and gamma tocopherol
209 | G291 | OE | Seed oil content | Seed biochemistry | Increased seed oil content
215 | G307 | OE | Leaf insoluble sugars | Leaf biochemistry | Altered leaf insoluble sugars
217 | G308 | OE | Sugar sensing | Sugar sensing | No germination on glucose medium
223 | G325 | OE | Osmotic | Abiotic stress | Better germination on high sucrose and high NaCl
231 | G346 | OE | Leaf fatty acids | Leaf biochemistry | Altered leaf fatty acids
231 | G346 | OE | Seed oil content | Seed biochemistry | Decreased seed oil
235 | G351 | OE | Light response | Dev and morph | Altered leaf orientation and light green coloration
237 | G361 | OE | Flowering time | Flowering time | Late flowering
239 | G374 | KO | Embryo lethal | Dev and morph | Embryo lethal
241 | G378 | OE | Erysiphe | Disease | Increased resistance
245 | G385 | OE | Size | Dev and morph | Small plant
245 | G385 | OE | Inflorescence | Dev and morph | Short inflorescence stems
245 | G385 | OE | Leaf | Dev and morph | Dark green plant
249 | G390 | OE | Flowering time | Flowering time | Early flowering
249 | G390 | OE | Architecture | Dev and morph | Altered shoot development
251 | G394 | OE | Cold | Abiotic stress | More sensitive to chilling
261 | G409 | OE | Erysiphe | Disease | Increased tolerance to Erysiphe
267 | G418 | OE | Pseudomonas | Disease | Increased tolerance
267 | G418 | OE | Seed protein content | Seed biochemistry | Decreased seed protein content
269 | G419 | OE | Nutrient uptake | Abiotic stress | Increased tolerance to potassium-free medium
271 | G428 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased leaf insoluble sugars
271 | G428 | OE | Leaf | Dev and morph | Altered leaf shape
273 | G431 | OE | Morphology: other | Dev and morph | Developmental defect, sterile
275 | G434 | OE | Flowering time | Flowering time | Late flowering
277 | G435 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased leaf insoluble sugars
283 | G438 | KO | Architecture | Dev and morph | Reduced branching
283 | G438 | KO | Stem | Dev and morph | Reduced lignin
283 | G438 | OE | Leaf | Dev and morph | Increased leaf size
283 | G438 | OE | Leaf | Dev and morph | Altered leaf shape
285 | G447 | OE | Size | Dev and morph | Reduced size
285 | G447 | OE | Morphology: other | Dev and morph | Altered cotyledon shape
285 | G447 | OE | Leaf | Dev and morph | Dark green leaves
287 | G456 | OE | Seed protein content | Seed biochemistry | Decreased seed protein
287 | G456 | OE | Seed oil content | Seed biochemistry | Increased seed oil
291 | G464 | OE | Seed oil content | Seed biochemistry | Increased seed oil
291 | G464 | OE | Heat | Abiotic stress | Better germination and growth in heat
291 | G464 | OE | Leaf | Dev and morph | Altered leaf shape
291 | G464 | OE | Seed protein content | Seed biochemistry | Decreased seed protein content
295 | G470 | OE | Fertility | Dev and morph | Short stamen filaments
301 | G475 | OE | Flowering time | Flowering time | Early flowering
303 | G477 | OE | Sclerotinia | Disease | Increased susceptibility to Sclerotinia
303 | G477 | OE | Oxidative | Abiotic stress | Increased sensitivity to oxidative stress
305 | G482 | OE | Sodium chloride | Abiotic stress | Tolerant to high salt
307 | G486 | OE | Flowering time | Flowering time | Late flowering
309 | G489 | OE | Osmotic | Abiotic stress | Increased tolerance to osmotic stress
313 | G502 | KO | Osmotic | Abiotic stress | Increased sensitivity to osmotic stress
317 | G509 | KO | Seed oil content | Seed biochemistry | Altered seed oil content
317 | G509 | KO | Seed protein content | Seed biochemistry | Altered seed protein content
327 | G515 | OE | Morphology: other | Dev and morph | Lethal when overexpressed
331 | G521 | OE | Leaf | Dev and morph | Leaf cell expansion
337 | G525 | OE | Pseudomonas | Disease | Increased tolerance to Pseudomonas
337 | G525 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased leaf insoluble sugars
339 | G526 | OE | Osmotic | Abiotic stress | Increased sensitivity to osmotic stress
343 | G536 | OE | Sugar sensing | Sugar sensing | Decreased germination and growth on glucose medium
345 | G545 | OE | Sodium chloride | Abiotic stress | Susceptible to high salt
345 | G545 | OE | Nutrient uptake | Abiotic stress | Increased tolerance to phosphate-free medium
345 | G545 | OE | Erysiphe | Disease | Increased susceptibility to Erysiphe
345 | G545 | OE | Pseudomonas | Disease | Increased susceptibility to Pseudomonas
345 | G545 | OE | Fusarium | Disease | Increased susceptibility to Fusarium
355 | G558 | OE | Defense gene expression | Disease | Increased expression of defense genes
357 | G559 | OE | Architecture | Dev and morph | Loss of apical dominance
357 | G559 | OE | Fertility | Dev and morph | Reduced fertility
359 | G561 | OE | Seed oil content | Seed biochemistry | Altered seed oil content
359 | G561 | OE | Nutrient uptake | Abiotic stress | Increased tolerance to potassium-free medium
361 | G562 | OE | Flowering time | Flowering time | Late flowering
369 | G567 | OE | Seed protein content | Seed biochemistry | Altered seed protein content
369 | G567 | OE | Seed oil content | Seed biochemistry | Altered seed oil content
369 | G567 | OE | Sugar sensing | Sugar sensing | Decreased seedling vigor on high glucose
371 | G569 | OE | Defense gene expression | Disease | Decreased expression of defense genes
375 | G571 | KO | Senescence | Dev and morph | Delayed senescence
375 | G571 | KO | Flowering time | Flowering time | Late flowering
379 | G578 | OE | Morphology: other | Dev and morph | Lethal when overexpressed
381 | G580 | OE | Flower | Dev and morph | Altered development
381 | G580 | OE | Architecture | Dev and morph | Altered inflorescences
385 | G584 | OE | Seed | Dev and morph | Large seeds
387 | G590 | KO | Seed oil content | Seed biochemistry | Increased seed oil content
387 | G590 | OE | Flowering time | Flowering time | Early flowering
389 | G591 | OE | Erysiphe | Disease | Increased tolerance to Erysiphe
389 | G591 | OE | Flowering time | Flowering time | Late flowering
391 | G592 | OE | Flowering time | Flowering time | Early flowering
393 | G598 | OE | Seed oil content | Seed biochemistry | Increased seed oil
393 | G598 | OE | Leaf insoluble sugars | Leaf biochemistry | Altered insoluble sugars
395 | G605 | OE | Leaf fatty acids | Leaf biochemistry | Altered leaf fatty acid composition
399 | G615 | OE | Architecture | Dev and morph | Altered plant architecture
399 | G615 | OE | Fertility | Dev and morph | Little or no pollen production, poor filament elongation
401 | G616 | OE | Erysiphe | Disease | Increased tolerance to Erysiphe
403 | G624 | OE | Sodium chloride | Abiotic stress | Increased tolerance to high salt
403 | G624 |  | Size | Dev and morph | Increased biomass
403 | G624 |  | Nutrient uptake | Abiotic stress | Increased tolerance to low phosphate
403 | G624 |  | Flowering time | Flowering time | Late flowering
405 | G627 | OE | Flowering time | Flowering time | Early flowering
407 | G629 | OE | Leaf | Dev and morph | Altered leaf morphology
407 | G629 | OE | Seed oil content | Seed biochemistry | Increased seed protein content
409 | G630 | OE | Seed protein content | Seed biochemistry | Increased seed protein content, embryo specific expression
415 | G634 | OE | Trichome | Dev and morph | Increased trichome density and size
417 | G636 | OE | Senescence | Dev and morph | Premature senescence
417 | G636 | OE | Size | Dev and morph | Reduced size
419 | G638 | OE | Flower | Dev and morph | Altered flower development
435 | G663 | OE | Seed oil content | Seed biochemistry | Decreased seed oil
435 | G663 | OE | Seed protein content | Seed biochemistry | Increased seed protein
435 | G663 | OE | Biochemistry: other | Biochem: misc | Increased anthocyanins in leaf, root, seed
437 | G664 | OE | Cold | Abiotic stress | Better germination and growth in cold
443 | G668 | OE | Seed protein content | Seed biochemistry | Increased seed protein content
443 | G668 | OE | Seed oil content | Seed biochemistry | Decreased seed oil content
443 | G668 | OE | Seed | Dev and morph | Reduced seed color
447 | G670 | OE | Size | Dev and morph | Small plant
449 | G671 | OE | Stem | Dev and morph | Altered inflorescence stem structure
449 | G671 | OE | Flower | Dev and morph | Reduced petal abscission
449 | G671 | OE | Leaf | Dev and morph | Altered leaf shape
449 | G671 | OE | Size | Dev and morph | Small plant
449 | G671 | OE | Fertility | Dev and morph | Reduced fertility
457 | G676 | OE | Trichome | Dev and morph | Reduced trichomes
463 | G680 | OE | Flowering time | Flowering time | Late flowering
463 | G680 | OE | Sugar sensing | Sugar sensing | Reduced germination on glucose medium
465 | G681 | OE | Leaf glucosinolates | Leaf biochemistry | Increase in M39480
467 | G682 | OE | Heat | Abiotic stress | Better germination and growth in heat
467 | G682 | OE | Trichome | Dev and morph | Glabrous, lack of trichomes
473 | G718 | OE | Seed protein content | Seed biochemistry | Increased seed protein
473 | G718 | OE | Leaf fatty acids | Leaf biochemistry | Altered leaf fatty acid composition
473 | G718 | OE | Seed prenyl lipids | Seed biochemistry | Increased seed lutein
473 | G718 | OE | Seed oil content | Seed biochemistry | Decreased seed oil
483 | G732 | OE | Seed protein content | Seed biochemistry | One OE line had increased, another decreased seed protein content
483 | G732 | OE | Seed oil content | Seed biochemistry | One OE line had increased, another decreased seed oil content
483 | G732 | OE | Architecture | Dev and morph | Reduced apical dominance
483 | G732 | OE | Flower | Dev and morph | Abnormal flowers
487 | G736 | OE | Flowering time | Flowering time | Late flowering
487 | G736 | OE | Leaf | Dev and morph | Altered leaf shape
489 | G738 | OE | Flowering time | Flowering time | Late flowering
489 | G738 | OE | Size | Dev and morph | Reduced size
491 | G740 | OE | Morphology: other | Dev and morph | Slow growth
497 | G748 | OE | Stem | Dev and morph | More vascular bundles in stem
497 | G748 | OE | Flowering time | Flowering time | Late flowering
497 | G748 | OE | Seed prenyl lipids | Seed biochemistry | Increased lutein content
503 | G752 | OE | Flowering time | Flowering time | Late flowering
507 | G760 | OE | Hormone sensitivity | Hormone sensitivity | Hypersensitive to ACC
507 | G760 | OE | Size | Dev and morph | Reduced size
519 | G776 | OE | Seed oil composition | Seed biochemistry | Altered seed fatty acid composition
521 | G777 | OE | Seed oil content | Seed biochemistry | Decreased seed oil
521 | G777 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased leaf rhamnose
523 | G778 | OE | Seed oil composition | Seed biochemistry | Increased seed 18:1 fatty acid
525 | G779 | OE | Fertility | Dev and morph | Reduced fertility
525 | G779 | OE | Flower | Dev and morph | Homeotic transformations
529 | G782 | OE | Sugar sensing | Sugar sensing | Better germination and growth on sucrose medium
531 | G783 | OE | Sugar sensing | Sugar sensing | Better germination and growth on sucrose medium
539 | G789 | OE | Sclerotinia | Disease | Increased susceptibility to Sclerotinia
539 | G789 | OE | Flowering time | Flowering time | Early flowering
539 | G789 | OE | Oxidative | Abiotic stress | Increased sensitivity
541 | G791 | OE | Seed oil composition | Seed biochemistry | Decreased seed fatty acid composition
541 | G791 | OE | Leaf insoluble sugars | Leaf biochemistry | Altered leaf cell wall polysaccharide composition
541 | G791 | OE | Leaf fatty acids | Leaf biochemistry | Altered leaf fatty acid composition
549 | G801 | OE | Sodium chloride | Abiotic stress | Better germination on NaCl
553 | G805 | OE | Sclerotinia | Disease | Increased susceptibility to Sclerotinia
557 | G831 | OE | Size | Dev and morph | Reduced size
565 | G849 | KO | Seed protein content | Seed biochemistry | Altered seed protein content
565 | G849 | KO | Seed oil content | Seed biochemistry | Increased seed oil content
567 | G859 | OE | Flowering time | Flowering time | Late flowering
571 | G861 | OE | Seed oil composition | Seed biochemistry | Increase in 16:1
573 | G864 | OE | Size | Dev and morph | Small plant
573 | G864 | OE | Cold | Abiotic stress | Adult stage chilling sensitivity
573 | G864 | OE | Heat | Abiotic stress | Better germination in heat
575 | G865 | OE | Erysiphe | Disease | Increased susceptibility to Erysiphe
575 | G865 | OE | Seed protein content | Seed biochemistry | Increased seed protein
575 | G865 | OE | Botrytis | Disease | Increased susceptibility to Botrytis
575 | G865 | OE | Flowering time | Flowering time | Early flowering
575 | G865 | OE | Morphology: other | Dev and morph | Altered morphology
579 | G867 | OE | Sugar sensing | Sugar sensing | Better seedling vigor on sucrose medium
579 | G867 | OE | Sodium chloride | Abiotic stress | Better seedling vigor on high salt
581 | G869 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased fucose
581 | G869 | OE | Erysiphe | Disease | Increased tolerance to Erysiphe
581 | G869 | OE | Morphology: other | Dev and morph | Small and spindly plant
581 | G869 | OE | Flower | Dev and morph | Abnormal anther development
581 | G869 | OE | Seed oil composition | Seed biochemistry | Altered seed fatty acids
581 | G869 | OE | Leaf fatty acids | Leaf biochemistry | Altered leaf fatty acids
583 | G877 | KO | Embryo lethal | Dev and morph | Embryo lethal
585 | G878 | OE | Senescence | Dev and morph | Delayed senescence
585 | G878 | OE | Flowering time | Flowering time | Late flowering
587 | G881 | OE | Botrytis | Disease | Increased susceptibility to Botrytis
587 | G881 | OE | Erysiphe | Disease | Increased susceptibility to Erysiphe
589 | G883 | OE | Seed prenyl lipids | Seed biochemistry | Decreased seed lutein
591 | G884 | OE | Sodium chloride | Abiotic stress | Increased root growth in high salt
591 | G884 | OE | Size | Dev and morph | Reduced size
595 | G896 | KO | Fusarium | Disease | Increased susceptibility to Fusarium
603 | G903 | OE | Leaf | Dev and morph | Altered leaf morphology
605 | G905 | OE | Flowering time | Flowering time | Late flowering
605 | G905 | OE | Leaf | Dev and morph | Altered leaf shape
605 | G905 | OE | Sugar sensing | Sugar sensing | Increased seedling vigor on high glucose
613 | G911 | OE | Nutrient uptake | Abiotic stress | Increased growth on potassium-free medium
613 | G911 | OE | Seed protein content | Seed biochemistry | Increased seed protein content
613 | G911 | OE | Seed oil content | Seed biochemistry | Decreased seed oil content
615 | G912 | OE | Freezing | Abiotic stress | Increased freezing tolerance
615 | G912 | OE | Morphology: other | Dev and morph | Dark green color
615 | G912 | OE | Drought | Abiotic stress | Increased survival in drought conditions
615 | G912 | OE | Sugar sensing | Sugar sensing | Reduced cotyledon expansion in glucose
615 | G912 | OE | Size | Dev and morph | Small plant
615 | G912 | OE | Flowering time | Flowering time | Late flowering
619 | G921 | OE | Osmotic | Abiotic stress | Increased sensitivity to osmotic stress
619 | G921 | OE | Leaf | Dev and morph | Serrated leaves
627 | G932 | OE | Leaf | Dev and morph | Altered development, dark green color
627 | G932 | OE | Size | Dev and morph | Reduced size
631 | G938 | OE | Seed oil composition | Seed biochemistry | Overexpressors had increased 16:0, 18:0, 20:0, and 18:3 fatty acids, decreased 18:2, 20:1, 22:1 fatty acids
633 | G961 | KO | Seed oil content | Seed biochemistry | Increased seed oil content
635 | G964 | OE | Heat | Abiotic stress | More tolerant to heat in germination assay
637 | G965 | OE | Seed oil composition | Seed biochemistry | Increase in 18:1 fatty acid
639 | G971 | OE | Flowering time | Flowering time | Late flowering
641 | G974 | OE | Seed oil content | Seed biochemistry | Altered seed oil content
643 | G975 | OE | Leaf fatty acids | Leaf biochemistry | Increased wax in leaves
647 | G977 | OE | Size | Dev and morph | Small plant
647 | G977 | OE | Morphology: other | Dev and morph | Dark green
647 | G977 | OE | Leaf | Dev and morph | Altered leaf shape
647 | G977 | OE | Fertility | Dev and morph | Reduced fertility
649 | G979 | KO | Seed | Dev and morph | Altered seed development, ripening, and germination
653 | G987 | KO | Leaf fatty acids | Leaf biochemistry | Reduction in 16:3 fatty acid
653 | G987 | KO | Leaf prenyl lipids | Leaf biochemistry | Presence of two xanthophylls, tocopherol not normally found in leaves; reduced chlorophyll a and b
655 | G991 | OE | Size | Dev and morph | Slightly reduced size
657 | G994 | OE | Flowering time | Flowering time | Late flowering
657 | G994 | OE | Size | Dev and morph | Small plants
659 | G996 | OE | Sugar sensing | Sugar sensing | Reduced germination on glucose medium
671 | G1012 | OE | Leaf insoluble sugars | Leaf biochemistry | Decreased rhamnose
673 | G1020 | OE | Size | Dev and morph | Very small T1 plants
677 | G1023 | OE | Size | Dev and morph | Reduced size
685 | G1037 | KO | Flowering time | Flowering time | Early flowering
687 | G1038 | OE | Leaf | Dev and morph | Altered leaf shape
687 | G1038 | OE | Leaf insoluble sugars | Leaf biochemistry | Decreased insoluble sugars
695 | G1048 | OE | Erysiphe | Disease | Increased tolerance to Erysiphe orontii
695 | G1048 | OE | Seed protein content | Seed biochemistry | Increased seed protein content
697 | G1050 | OE | Senescence | Dev and morph | Delayed senescence
699 | G1052 | OE | Flowering time | Flowering time | Late flowering
699 | G1052 | OE | Seed prenyl lipids | Seed biochemistry | Decrease in lutein and increase in xanthophyll 1
701 | G1053 | OE | Size | Dev and morph | Small plant
713 | G1062 | KO | Light response | Dev and morph | Constitutive photomorphogenesis
713 | G1062 | KO | Hormone sensitivity | Hormone sensitivity | Altered response to ethylene
713 | G1062 | KO | Seed | Dev and morph | Altered seed shape
713 | G1062 | KO | Morphology: other | Dev and morph | Slow growth
717 | G1067 | OE | Leaf | Dev and morph | Altered leaf shape
717 | G1067 | OE | Size | Dev and morph | Small plant
717 | G1067 | OE | Fertility | Dev and morph | Reduced fertility
719 | G1068 | OE | Sugar sensing | Sugar sensing | Reduced cotyledon expansion in glucose
721 | G1069 | OE | Osmotic | Abiotic stress | Better germination under osmotic stress
721 | G1069 | OE | Hormone sensitivity | Hormone sensitivity | Reduced ABA sensitivity
721 | G1069 | OE | Leaf glucosinolates | Leaf biochemistry | Altered composition
723 | G1073 | OE | Size | Dev and morph | Increased plant size
723 | G1073 | OE | Leaf | Dev and morph | Serrated leaves
723 | G1073 | OE | Flowering time | Flowering time | Flowering slightly delayed
725 | G1075 | OE | Size | Dev and morph | Small plant
725 | G1075 | OE | Flower | Dev and morph | Reduced or absent petals, sepals and stamens
725 | G1075 | OE | Fertility | Dev and morph | Reduced fertility
725 | G1075 | OE | Leaf | Dev and morph | Altered leaf shape
727 | G1076 | OE | Morphology: other | Dev and morph | Lethal when overexpressed
731 | G1089 | KO | Osmotic | Abiotic stress | Better germination under osmotic stress
731 | G1089 | OE | Morphology: other | Dev and morph | Developmental defects at seedling stage
737 | G1128 | OE | Leaf | Dev and morph | Dark green leaves
737 | G1128 | OE | Senescence | Dev and morph | Premature leaf and flower senescence
739 | G1133 | OE | Leaf prenyl lipids | Leaf biochemistry | Decreased leaf lutein
741 | G1134 | OE | Silique | Dev and morph | Siliques with altered shape
741 | G1134 | OE | Hormone sensitivity | Hormone sensitivity | Altered response to the growth hormone ethylene
745 | G1136 | OE | Flowering time | Flowering time | Late flowering
745 | G1136 | OE | Nutrient uptake | Abiotic stress | Increased sensitivity to low nitrogen
749 | G1145 | OE | Seed | Seed morphology | Reduced seed size
749 | G1145 | OE | Seed | Seed morphology | Altered seed shape
753 | G1181 | OE | Size | Dev and morph | Small T1 plants
759 | G1190 | OE | Seed oil content | Seed biochemistry | Increased seed oil content
763 | G1198 | OE | Seed oil content | Seed biochemistry | Increased seed oil content
763 | G1198 | OE | Leaf glucosinolates | Leaf biochemistry | Altered glucosinolate composition
775 | G1228 | OE | Size | Dev and morph | Reduced size
787 | G1242 | OE | Flowering time | Flowering time | Early flowering
793 | G1255 | OE | Seed | Dev and morph | Increased seed size
793 | G1255 | OE | Morphology: other | Dev and morph | Reduced apical dominance
793 | G1255 | OE | Botrytis | Disease | Increased susceptibility to Botrytis
799 | G1266 | OE | Leaf fatty acids | Leaf biochemistry | Changes in leaf fatty acids
799 | G1266 | OE | Erysiphe | Disease | Increased tolerance to Erysiphe
799 | G1266 | OE | Size | Dev and morph | Small plant
799 | G1266 | OE | Fertility | Dev and morph | Reduced fertility
799 | G1266 | OE | Leaf insoluble sugars | Leaf biochemistry | Changes in leaf insoluble sugars
801 | G1267 | OE | Size | Dev and morph | Small plant
801 | G1267 | OE | Leaf | Leaf morphology | Dark green leaves
801 | G1267 | OE | Leaf | Leaf morphology | Shiny leaves
803 | G1269 | OE | Leaf | Dev and morph | Long petioles, upturned leaves
805 | G1274 | OE | Cold | Abiotic stress | Increased tolerance to cold
805 | G1274 | OE | Nutrient uptake | Abiotic stress | Increased tolerance to nitrogen-limited medium
805 | G1274 | OE | Morphology: other | Dev and morph | Inflorescence architecture
805 | G1274 | OE | Leaf | Dev and morph | Large leaves
807 | G1275 | OE | Size | Dev and morph | Small plant
807 | G1275 | OE | Architecture | Dev and morph | Reduced apical dominance
809 | G1277 | OE | Size | Dev and morph | Small plant
817 | G1304 | OE | Morphology: other | Dev and morph | Lethal when overexpressed
819 | G1305 | OE | Flowering time | Flowering time | Early flowering
819 | G1305 | OE | Heat | Abiotic stress | Reduced chlorosis at high temperature
825 | G1309 | OE | Size | Dev and morph | Small plant
825 | G1309 | OE | Leaf insoluble sugars | Leaf biochemistry | Increased mannose
827 | G1311 | OE | Fertility | Dev and morph | Reduced fertility
827 | G1311 | OE | Size | Dev and morph | Small plant
829 | G1313 | OE | Morphology: other | Dev and morph | Increased seedling size
831 | G1314 | OE | Sugar sensing | Sugar sensing | Reduced seedling vigor on high glucose
831 | G1314 | OE | Size | Dev and morph | Reduced size
837 | G1317 | OE | Size | Dev and morph | Reduced size
841 | G1322 | OE | Cold | Abiotic stress | Increased seedling vigor in cold conditions
841 | G1322 | OE | Leaf glucosinolates | Leaf biochemistry | Increase in M39480
841 | G1322 | OE | Light response | Dev and morph | Photomorphogenesis in the dark
841 | G1322 | OE | Size | Dev and morph | Reduced size
843 | G1323 | OE | Seed oil content | Seed biochemistry | Decreased seed oil
843 | G1323 | OE | Seed protein content | Seed biochemistry | Increased seed protein
843 | G1323 | OE | Size | Dev and morph | Small T1 plants, dark green
845 | G1324 | OE | Leaf prenyl lipids | Leaf biochemistry | Decreased leaf lutein, increased leaf xanthophyll
849 | G1326 | OE | Flower | Dev and morph | Petals and sepals are smaller
849 | G1326 | OE | Size | Dev and morph | Small plant
849 | G1326 | OE | Fertility | Dev and morph | Reduced fertility
851 | G1327 | OE | Leaf | Dev and morph | Dark green leaves
853 | G1328 | OE | Seed prenyl lipids | Seed biochemistry | Decreased seed lutein
855 | G1332 | OE | Trichome | Dev and morph | Reduced trichome density
855 | G1332 | OE | Size | Dev and morph | Reduced plant size
857 | G1334 | OE | Size | Dev and morph | Small plants
857 | G1334 | OE | Leaf | Leaf morphology | Dark green leaves
859 | G1335 | OE | Flowering time | Flowering time | Late flowering
859 | G1335 | OE | Dev and morph | Dev and morph | Slow growth
861 | G1337 | OE | Sugar sensing | Sugar sensing | Decreased germination on sucrose medium
861 | G1337 | OE | Leaf fatty acids | Leaf biochemistry | Altered leaf fatty acid composition
867 | G1380 | OE | Flowering time | Flowering time | Early flowering
869 | G1382 | OE | Size | Dev and morph | Small plant
877 | G1399 | OE | Leaf fatty acids | Leaf biochemistry | Altered composition
879 | G1412 | OE | Hormone sensitivity | Hormone sensitivity | ABA insensitive
879 | G1412 |  | Osmotic | Abiotic stress | Increased tolerance to osmotic stress
881 | G1417 | KO | Morphology: other | Dev and morph | Reduced seedling germination and vigor
881 | G1417 | KO | Seed oil composition | Seed biochemistry | Increase in 18:2, decrease in 18:3 fatty acids
883 | G1425 | OE | Flower | Dev and morph | Altered flower and inflorescence development
887 | G1435 | OE | Flowering time | Flowering time | Late flowering
887 | G1435 | OE | Size | Dev and morph | Increased plant size
891 | G1449 | OE | Seed protein content | Seed biochemistry | Increased seed protein content
891 | G1449 | OE | Flower | Dev and morph | Altered flower structure
893 | G1451 | OE | Leaf | Dev and morph | Large leaf size
893 | G1451 | KO | Seed oil content | Seed biochemistry | Altered seed oil content
893 | G1451 | OE | Morphology: other | Dev and morph | Increased plant size
893 | G1451 | OE | Flowering time | Flowering time | Late flowering
895 | G1468 | OE | Flowering time | Flowering time | Late flowering
895 | G1468 |  | Size | Dev and morph | Increased biomass
895 | G1468 |  | Leaf | Dev and morph | Grayish and narrow leaves
895 | G1468 |  | Morphology: other | Dev and morph | Slow growth rate
897 | G1471 | OE | Seed oil content | Seed biochemistry | Increased seed oil content
899 | G1472 | OE | Morphology: other | Dev and morph | No shoot meristem
901 | G1474 | OE | Morphology: other | Dev and morph | Reduced plant size
901 | G1474 | OE | Flowering time | Flowering time | Late flowering
901 | G1474 | OE | Inflorescence | Dev and morph | Inflorescence architecture
903 | G1476 | OE | Morphology: other | Dev and morph | Faster seedling growth rate
905 | G1482 | KO | Root | Dev and morph | Increased root growth
905 | G1482 | OE | Biochemistry: other | Biochem: misc | Increased anthocyanins
911 | G1493 | OE | Sugar sensing | Sugar sensing | Increased seedling vigor on high glucose
911 | G1493 | OE | Flowering time | Flowering time | Late flowering
911 | G1493 | OE | Leaf | Dev and morph | Altered leaf shape
913 | G1499 | OE | Architecture | Dev and morph | Altered plant architecture
913 | G1499 | OE | Flower | Dev and morph | Altered floral organ identity and development
913 | G1499 | OE | Morphology: other | Dark green color | Dark green leaves
919 | G1540 | OE | Morphology: other | Dev and morph | Reduced cell differentiation in meristem
921 | G1545 | OE | Flowering time | Flowering time | Early flowering
921 | G1545 | OE | Size | Dev and morph | Reduced size
925 | G1560 | OE | Fertility | Dev and morph | Reduced fertility
925 | G1560 | OE | Size | Dev and morph | Reduced size
925 | G1560 | OE | Flower | Dev and morph | Abnormal flowers
927 | G1634 | OE | Seed protein content | Seed biochemistry | Altered seed protein content
929 | G1645 | OE | Inflorescence | Dev and morph | Altered inflorescence structure
929 | G1645 | OE | Leaf | Dev and morph | Altered leaf development
929 | G1645 | OE | Heat | Abiotic stress | Reduced germination vigor
937 | G1760 | OE | Flowering time | Flowering time | Early flowering
939 | G1816 | OE | Trichome | Dev and morph | Glabrous
939 | G1816 | OE | Root | Dev and morph | Ectopic root hairs, more root hairs
939 | G1816 | OE | Nutrient uptake | Abiotic stress | Improved tolerance to low nitrogen
939 | G1816 | OE | Sugar sensing | Sugar
sensingInsensitive to growth retardation OE Pigment Dev and morph effects ofhigh glucose Reduced pigment 941 G1820 OE Osmotic Abiotic stress Bettergermination in NaCl OE Seed protein content Seed biochemistry Increasedseed protein content OE Seed oil content Seed biochemistry Decreasedseed oil content OE Hormone sensitivity Hormone sensitivity Reduced ABAsensitivity OE Flowering time Flowering time Early flowering DroughtAbiotic stress Increased tolerance to drought 943 G1842 OE Floweringtime Flowering time Early flowering 945 G1843 OE Flowering timeFlowering time Early flowering 949 G1947 KO Fertility Dev and morphReduced fertility KO Flower Dev and morph Extended period of flowering951 G2010 OE Flowering time Flowering time Early flowering 957 G2347 OEFlowering time Flowering time Early flowering 959 G2718 OE TrichomeDevel and morph Reduction in trichome density ranging OE Nutrient uptakeAbiotic stress from mild to glabrous OE Root Dev and morph Tolerant tolow nitrogen conditions OE Pigment Dev and morph Ectopic root hairs,more root hairs Reduced pigment

Tables 5A and 5B show the polypeptides identified by SEQ ID NO; Mendel Gene ID (GID) No.; the transcription factor family to which the polypeptide belongs; and conserved domains of the polypeptide. The first column shows the polypeptide SEQ ID NO; the second column shows the GID No.; the third column shows the transcription factor family to which the polypeptide belongs; and the fourth column shows the amino acid residue positions of the conserved domain in amino acid (AA) coordinates.

TABLE 5A Gene families and conserved domains

Polypeptide SEQ ID NO:  GID No.  Family  Conserved Domains in Amino Acid Coordinates
2  G3  AP2  28-95
4  G4  AP2  121-188
6  G5  AP2  149-216
8  G6  AP2  22-89
10  G7  AP2  58-125
12  G8  AP2  151-217, 243-296
14  G9  AP2  62-127
16  G10  AP2  21-88
18  G13  AP2  19-86
20  G14  AP2  122-189
22  G19  AP2  76-145
24  G20  AP2  68-144
26  G21  AP2  97-164
28  G22  AP2  89-157
30  G24  AP2  25-93
32  G25  AP2  47-114
34  G26  AP2  67-134
36  G27  AP2  37-104
38  G28  AP2  145-213
40  G32  AP2  17-84
42  G38  AP2  76-143
44  G40  AP2  45-112
46  G41  AP2  39-106
48  G42  AP2  48-115
50  G43  AP2  104-172
52  G44  AP2  85-154
54  G46  AP2  107-175
56  G134  MADS  1-57
58  G142  MADS  2-57
60  G145  MADS  1-57
62  G146  MADS  1-57
64  G152  MADS  2-57
66  G153  MADS  1-57
68  G156  MADS  2-57
70  G157  MADS  2-57
72  G162  MADS  2-57
74  G166  MADS  2-56
76  G170  MADS  2-57
78  G171  MADS  1-57
80  G173  MADS  1-57
82  G176  WRKY  117-173, 234-290
84  G177  WRKY  166-221, 328-384
86  G179  WRKY  65-121
88  G180  WRKY  118-174
90  G184  WRKY  295-352
92  G185  WRKY  113-172
94  G186  WRKY  312-369
96  G187  WRKY  172-228
98  G188  WRKY  175-222
100  G190  WRKY  110-169
102  G192  WRKY  128-185
104  G194  WRKY  174-230
106  G196  WRKY  223-283
108  G197  MYB-(R1)R2R3  14-119
110  G198  MYB-(R1)R2R3  14-117
112  G201  MYB-(R1)R2R3  14-114
114  G203  MYB-(R1)R2R3  93-191
116  G206  MYB-(R1)R2R3  13-116
118  G207  MYB-(R1)R2R3  6-106
120  G208  MYB-(R1)R2R3  14-116
122  G211  MYB-(R1)R2R3  24-137
124  G212  MYB-(R1)R2R3  15-116
126  G213  MYB-(R1)R2R3  20-120
128  G214  MYB-related  22-71
130  G215  MYB-related  117-184
132  G216  MYB-(R1)R2R3  19-151
134  G220  MYB-(R1)R2R3  15-116
136  G222  MYB-(R1)R2R3  13-119
138  G224  PMR  7-114
140  G225  MYB-related  39-76
142  G226  MYB-related  28-78
144  G227  MYB-(R1)R2R3  13-112
146  G228  MYB-related  59-135
148  G229  MYB-(R1)R2R3  14-120
150  G230  MYB-(R1)R2R3  13-114
152  G231  MYB-(R1)R2R3  14-118
154  G232  MYB-(R1)R2R3  14-115
156  G233  MYB-(R1)R2R3  14-114
158  G234  MYB-(R1)R2R3  14-115
160  G237  MYB-(R1)R2R3  11-113
162  G239  MYB-(R1)R2R3  21-125
164  G241  MYB-(R1)R2R3  14-114
166  G242  MYB-(R1)R2R3  6-105
168  G245  MYB-(R1)R2R3  14-114
170  G247  MYB-(R1)R2R3  15-116
172  G248  MYB-(R1)R2R3  264-332
174  G249  MYB-(R1)R2R3  19-116
176  G251  MYB-(R1)R2R3  9-112
178  G252  MYB-(R1)R2R3  14-115
180  G254  MYB-related  62-106
182  G255  MYB-(R1)R2R3  14-115
184  G256  MYB-(R1)R2R3  13-115
186  G257  MYB-(R1)R2R3  20-120
188  G258  MYB-(R1)R2R3  24-124
190  G260  HS
192  G261  HS
194  G263  HS
196  G266  HS  45-135
198  G270  AKR  259-424
200  G274  AKR
202  G279  HMG
204  G280  AT-hook  97-104, 130-137, 155-162, 185-192
206  G285  MISC
208  G290  SWI/SNF  538-784, 919-958, 1086-1169
210  G291  MISC  132-160
212  G295  bZIP  287-354
214  G301
216  G307  SCR  323-339
218  G308  SCR  270-274
220  G313  SCR  11-279
222  G315  SCR  205-209
224  G325  Z-CO-like  5-28, 48-71
226  G326  Z-CO-like  11-94, 354-400
228  G335  Z-Tall-1  205-218
230  G341  Z-C3H  254-374
232  G346  GATA/Zn  196-221
234  G348  RING/C3HC4  28-53
236  G351  Z-C2H2  77-97, 118-140
238  G361  Z-C2H2  43-63
240  G374  Z-ZPF  35-67, 286-318
242  G378  RING/C3H2C3  196-237
244  G384  HB  14-77
246  G385  HB  60-123
248  G389  HB  84-147
250  G390  HB  18-81
252  G394  HB  121-182
254  G395  HB  72-135
256  G398  HB  128-191
258  G399  HB
260  G404  HB  65-128
262  G409  HB  64-124
264  G413  HB  37-97
266  G414  HB  61-124
268  G418  HB  500-560
270  G419  HB  392-452
272  G428  HB  229-292
274  G431  HB  286-335
276  G434  HB  39-99
278  G435  HB  4-67
280  G436  HB  22-85
282  G437  HB  13-76
284  G438  HB  22-85
286  G447  ARF  22-356
288  G456  IAA  7-14, 71-81, 120-153, 185-221
290  G462  IAA
292  G464  IAA  20-28, 71-82, 126-142, 187-224
294  G467  IAA
296  G470  ARF  61-393
298  G471  ARF  22-354
300  G472  ARF  12-343
302  G475  SBP  53-127
304  G477  SBP  108-233
306  G482  CAAT  25-116
308  G486  CAAT  5-66
310  G489  CAAT  57-156
312  G501  NAC
314  G502  NAC  10-155
316  G503  NAC  12-158
318  G509  NAC  13-169
320  G511  NAC  8-159
322  G512  NAC  24-160
324  G513  NAC  16-161
326  G514  NAC  19-161
328  G515  NAC  6-144
330  G516  NAC  10-131
332  G521  NAC  7-156
334  G523  NAC  20-140
336  G524  NAC  18-157
338  G525  NAC  23-167
340  G526  NAC  21-149
342  G528  GF14  230-237
344  G536  GF14  226-233
346  G545  Z-C2H2  82-102, 136-154
348  G548  HS  12-101
350  G553  bZIP  94-160
352  G554  bZIP  82-142
354  G555  bZIP  38-110
356  G558  bZIP  45-105
358  G559  bZIP  203-264
360  G561  bZIP  248-308
362  G562  bZIP  253-315
364  G563  bZIP  186-248
366  G564  bZIP  22-82
368  G566  bZIP  227-290
370  G567  bZIP  210-270
372  G569  bZIP  90-153
374  G570  bZIP  370-430
376  G571  bZIP  160-220, 441-452
378  G572  bZIP  120-186
380  G578  bZIP  36-96
382  G580  bZIP  162-218
384  G582  HLH/MYC  152-204
386  G584  HLH/MYC  401-494
388  G590  HLH/MYC  202-254
390  G591  HLH/MYC  143-240
392  G592  HLH/MYC  290-342
394  G598  DBF  205-263
396  G605  AT-hook  132-143
398  G610  BPF-1  330-410, 530-622
400  G615  TEO  88-147
402  G616  TEO  39-95
404  G624  ABI3/VP-1  327-406
406  G627  MADS  1-57
408  G629  bZIP  92-152
410  G630  bZIP  74-146
412  G631  bZIP  212-282
414  G632  TH  70-160
416  G634  TH  62-147, 189-245
418  G636  TH  55-145, 405-498
420  G638  TH  119-206
422  G639  TH  304-389
424  G640  TH
426  G641  TH  23-102
428  G654  Z-LIM  10-61, 108-159
430  G656  MYB-(R1)R2R3  14-114
432  G659  MYB-(R1)R2R3  16-116
434  G661  MYB-(R1)R2R3  13-116
436  G663  MYB-(R1)R2R3  9-111
438  G664  MYB-(R1)R2R3  13-116
440  G665  MYB-related  88-157
442  G667  MYB-(R1)R2R3  14-116
444  G668  MYB-(R1)R2R3  13-113
446  G669  MYB-(R1)R2R3  15-118
448  G670  MYB-(R1)R2R3  14-122
450  G671  MYB-(R1)R2R3  15-115
452  G672  MYB-related  92-161
454  G673  MYB-related  37-95
456  G675  MYB-(R1)R2R3  13-116
458  G676  MYB-(R1)R2R3  17-119
460  G677  MYB-(R1)R2R3  12-116
462  G679  MYB-related  98-167
464  G680  MYB-related  24-70
466  G681  MYB-(R1)R2R3  14-120
468  G682  MYB-related  27-63
470  G699  HB  52-115
472  G713  HB  23-86
474  G718  SBP  169-242
476  G722  GARP  188-236
478  G723  GARP
480  G726  GARP
482  G729  GARP  224-272
484  G732  bZIP  31-91
486  G735  bZIP  153-237
488  G736  Z-Dof  54-111
490  G738  Z-Dof  351-393
492  G740  Z-CLDSH  24-42, 232-268
494  G743  Z-ZPF  51-82
496  G746  RING/C3HC4  139-178
498  G748  Z-Dof  112-140
500  G749  Z-C3H
502  G751  Z-Dof  37-82
504  G752  RING/C3H2C3  439-479
506  G759  NAC  17-159
508  G760  NAC  12-156
510  G763  NAC  17-157
512  G764  NAC  27-171
514  G765  NAC  23-167
516  G767  NAC  8-158
518  G773  NAC  17-159
520  G776  NAC  27-175
522  G777  HLH/MYC  47-101
524  G778  HLH/MYC  220-267
526  G779  HLH/MYC  126-182
528  G780  HLH/MYC
530  G782  HLH/MYC  1-88
532  G783  HLH/MYC  20-109
534  G784  HLH/MYC  139-191
536  G786  HLH/MYC
538  G787  HLH/MYC  61-124
540  G789  HLH/MYC  253-313
542  G791  HLH/MYC  75-143
544  G792  HLH/MYC  70-138
546  G793  HLH/MYC  151-206
548  G795  DBF
550  G801  PCF  32-93
552  G802  PCF
554  G805  PCF  51-114
556  G820  AKR
558  G831  AKR  470-591
560  G832  AKR
562  G837  AKR  548-869
564  G838  AKR  274-420
566  G849  BPF-1  324-413, 504-583
568  G859  MADS  2-57
570  G860  MADS  2-57
572  G861  MADS  2-57
574  G864  AP2  119-186
576  G865  AP2  36-103
578  G866  WRKY  43-300
580  G867  AP2  59-124
582  G869  AP2  109-177
584  G877  WRKY  272-328, 487-603
586  G878  WRKY  250-305, 415-475
588  G881  WRKY  176-233
590  G883  WRKY  245-302
592  G884  WRKY  227-285, 407-465
594  G886  BZIPT2  1-53, 542-652
596  G896  Z-LSD-like  18-39
598  G897  Z-CO-like  8-39, 51-82
600  G901  Z-CO-like  1-98
602  G902  Z-CO-like
604  G903  Z-C2H2  68-92
606  G905  RING/C3H2C3  118-159
608  G907  Z-C3H  124-174
610  G908  Z-C2H2
612  G909  Z-LIM
614  G911  RING/C3H2C3  86-129
616  G912  AP2  51-118
618  G915  WRKY  184-239, 362-418
620  G921  WRKY  146-203
622  G928  CAAT  179-239
624  G929  CAAT  97-158
626  G931  CAAT  172-232
628  G932  MYB-(R1)R2R3  12-118
630  G935  ARF
632  G938  EIL  96-104
634  G961  NAC  12-180
636  G964  HB  126-186
638  G965  HB  423-486
640  G971  AP2  120-186
642  G974  AP2  81-140
644  G975  AP2  4-71
646  G976  AP2  87-153
648  G977  AP2  5-72
650  G979  AP2  63-139, 165-233
652  G986  WRKY  146-203
654  G987  SCR  428-432, 704-708
656  G991  IAA  7-14, 48-59, 82-115, 128-164
658  G994  MYB-(R1)R2R3  14-123
660  G996  MYB-(R1)R2R3  14-114
662  G997  MYB-related  9-63
664  G998  MYB-(R1)R2R3  28-131
666  G1004  AP2  153-221
668  G1005  AP2  25-92
670  G1006  AP2  114-182
672  G1012  WRKY  30-86
674  G1020  AP2  28-95
676  G1022  WRKY  281-340
678  G1023  AP2  128-195
680  G1025  SWI/SNF
682  G1030  HMG
684  G1034  bZIP  97-160
686  G1037  GARP  11-134, 200-248
688  G1038  GARP  198-247
690  G1039  GARP  214-262
692  G1043  WRKY  120-179
694  G1045  bZIP  96-156
696  G1048  bZIP  138-190
698  G1050  bZIP  372-425
700  G1052  bZIP  201-261
702  G1053  bZIP  74-120
704  G1055  bZIP  192-249
706  G1056  bZIP  183-246
708  G1057  bZIP  305-365
710  G1058  bZIP  292-386
712  G1061  HLH/MYC  149-200
714  G1062  HLH/MYC  308-359
716  G1065  DBP
718  G1067  AT-hook  86-93
720  G1068  AT-hook  143-150
722  G1069  AT-hook  67-74
724  G1073  AT-hook  33-42, 78-175
726  G1075  AT-hook  78-85
728  G1076  AT-hook  82-89
730  G1087  BZIPT2
732  G1089  BZIPT2  425-500
734  G1090  AP2  19-85
736  G1091  WRKY  262-319
738  G1128  AT-hook  181-247
740  G1133  HLH/MYC  256-326
742  G1134  HLH/MYC  198-247
744  G1135  HLH/MYC  363-440
746  G1136  HLH/MYC  397-474
748  G1141  AP2  75-142
750  G1145  bZIP  227-270
752  G1149  PAZ  870-880
754  G1181  HS  24-114
756  G1183
758  G1186  AKR  14-611
760  G1190  AKR
762  G1197  GARP
764  G1198  bZIP  173-223
766  G1211  MISC  123-179
768  G1212  MISC  110-131
770  G1216  BPF-1  490-543
772  G1218  MISC  66-250, 323-481, 575-645
774  G1222
776  G1228  HLH/MYC  179-233
778  G1232  Z-C4HC3
780  G1233  Z-C4HC3
782  G1237  Z-C4HC3  197-245
784  G1240  MISC
786  G1241  MISC
788  G1242  SWI/SNF  2-165
790  G1243  SWI/SNF  216-609
792  G1249  CAAT  13-110
794  G1255  Z-CO-like  18-56
796  G1258  Z-Dof
798  G1261  Z-CO-like  141-182, 184-207
800  G1266  AP2  79-147
802  G1267  WRKY  70-127
804  G1269  MYB-related  27-83
806  G1274  WRKY  111-164
808  G1275  WRKY  113-169
810  G1277  AP2  18-85
812  G1278  bZIP  230-328
814  G1290  AKR  270-366
816  G1300  Z-C4HC3  197-245
818  G1304  MYB-(R1)R2R3  13-118
820  G1305  MYB-(R1)R2R3  15-118
822  G1307  MYB-(R1)R2R3  14-114
824  G1308  MYB-(R1)R2R3  1-128
826  G1309  MYB-(R1)R2R3  9-114
828  G1311  MYB-(R1)R2R3  11-112
830  G1313  MYB-(R1)R2R3  32-135
832  G1314  MYB-(R1)R2R3  14-116
834  G1315  MYB-(R1)R2R3  14-115
836  G1316  MYB-(R1)R2R3  26-126
838  G1317  MYB-(R1)R2R3  13-118
840  G1319  MYB-(R1)R2R3  14-114
842  G1322  MYB-(R1)R2R3  26-130
844  G1323  MYB-(R1)R2R3  15-116
846  G1324  MYB-(R1)R2R3  20-118
848  G1325  MYB-(R1)R2R3  43-147
850  G1326  MYB-(R1)R2R3  18-121
852  G1327  MYB-(R1)R2R3  14-116
854  G1328  MYB-(R1)R2R3  14-119
856  G1332  MYB-(R1)R2R3  13-116
858  G1334  CAAT  18-190
860  G1335  Z-CLDSH  24-43, 131-144, 185-203
862  G1337  Z-CO-like  9-75
864  G1362  MYB-related  115-166
866  G1366  bZIP  14-74
868  G1380  AP2  24-92
870  G1382  WRKY  210-266, 385-437
872  G1391  GARP  230-277
874  G1395  S1FA  1-72
876  G1398  DBF  162-203
878  G1399  AT-hook  86-93
880  G1412  NAC  13-162
882  G1417  WRKY  239-296
884  G1425  NAC  20-173
886  G1426  NAC  3-131
888  G1435  GARP  146-194
890  G1448  IAA  43-50, 144-154, 187-220, 254-290
892  G1449  IAA  48-53, 74-107, 122-152
894  G1451  ARF  22-357
896  G1468  Z-C2H2  95-115, 170-190
898  G1471  Z-C2H2  49-70
900  G1472  Z-C2H2  83-106
902  G1474  Z-C2H2  41-68
904  G1476  Z-C2H2  37-57
906  G1482  Z-CO-like  5-63
908  G1483  Z-CO-like  17-66
910  G1490  GARP  193-241
912  G1493  GARP  242-289
914  G1499  HLH/MYC  118-181
916  G1504  GATA/Zn  193-206
918  G1508  GATA/Zn  38-63
920  G1540  HB  35-98
922  G1545  HB  54-117
924  G1552  SBP  101-177
926  G1560  HS  62-151
928  G1634  MYB-related  129-180
930  G1645  MYB-(R1)R2R3  90-210
932  G1650  HLH/MYC  284-334
934  G1664  HLH/MYC
936  G1669  Z-CO-like
938  G1760  MADS  2-57
940  G1816  MYB-related  31-81
942  G1820  CAAT  70-133
944  G1842  MADS  2-57
946  G1843  MADS  2-57
948  G1844  MADS  2-57
950  G1947  HS  37-120
952  G2010  SBP  53-127
954  G2119  RING/C3H2C3
956  G2120  VAR  0-0
958  G2347  SBP  60-136
960  G2718  MYB-related  21-76
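The conserved-domain column of Table 5A lists residue ranges such as "151-217, 243-296". As an illustration only, the following Python sketch shows one way such an entry could be parsed and used to pull the corresponding domain subsequences out of a polypeptide sequence. It assumes the coordinates are 1-based and inclusive, which the table does not state explicitly; the function names `parse_domains` and `extract_domains` and the toy sequence are hypothetical and are not part of the disclosure.

```python
from typing import List, Tuple


def parse_domains(coords: str) -> List[Tuple[int, int]]:
    """Parse a Table 5A style entry such as '151-217, 243-296'.

    Assumption: each range is 'start-end' in 1-based, inclusive residue
    positions. An empty entry (no domain listed) yields an empty list.
    """
    ranges = []
    for part in coords.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, end_text = part.split("-")
        start, end = int(start_text), int(end_text)
        # Entries such as '0-0' cannot be mapped to residues and are rejected.
        if start < 1 or end < start:
            raise ValueError(f"unusable residue range: {part!r}")
        ranges.append((start, end))
    return ranges


def extract_domains(protein: str, coords: str) -> List[str]:
    """Return the subsequence covered by each conserved-domain range."""
    domains = []
    for start, end in parse_domains(coords):
        if end > len(protein):
            raise ValueError(f"range {start}-{end} exceeds sequence length")
        # Convert 1-based inclusive coordinates to a Python slice.
        domains.append(protein[start - 1:end])
    return domains


if __name__ == "__main__":
    # Hypothetical 10-residue toy sequence, not a sequence from the listing.
    toy = "MKTAYIAKQR"
    print(extract_domains(toy, "2-4, 7-9"))
```

With a real entry, the sequence would come from the sequence listing for the given SEQ ID NO and the coordinate string from the fourth column of the table.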

TABLE 5B MAF Gene family and conserved domains

Polypeptide SEQ ID NO:  GID No.  Family  Conserved Domains
568  G859  MADS  2-57
944  G1842  MADS  2-57
946  G1843  MADS  2-57
948  G1844  MADS  2-57
1735  G157  MADS  2-57
1875  G1759  MADS  2-57
1971  Soy1  MADS  2-57
1973  Soy3  MADS  2-57
1945  G859.3  MADS  2-57
1947  G859.1  MADS  2-57
1949  G859.4  MADS  2-57
1951  G859.5  MADS  2-57
1953  G1842.2  MADS  2-57
1955  G1842.7  MADS  2-57
1957  G1842.8  MADS  2-57
1959  G1842.6  MADS  2-57
1961  G1843.1  MADS  2-57
1963  G1843.2  MADS  2-57
1965  G1843.3  MADS  2-57
1967  G1843.4  MADS  2-57
1969  G1844.2  MADS  2-57
1971  Soy MADS1  MADS  2-57
1973  Soy MADS3  MADS  2-57

For many crops, high yielding winter strains can only be grown in regions where the growing season is sufficiently cold and prolonged to elicit vernalization. A system that could trigger flowering at higher temperatures would greatly expand the acreage over which winter varieties can be cultivated. The finding that G157 (SEQ ID NO: 1734) overexpression causes early flowering in Arabidopsis Stockholm and Pitztal plants indicates that the gene can overcome the high level of FRIGIDA and FLC activity present in those late-flowering ecotypes. The effects are similar to those caused by vernalization, which implies that G157 (SEQ ID NO: 1734) and the related MADS-box transcription factors (MAFs; SEQ ID NOs: 567, 1944, 1946, 1948, 1950, 943, 1952, 1954, 1956, 1958, 945, 1960, 1962, 1964, 1966, 947, 1968, 1970, and 1972) might be used in winter strains of crop species. To date, a substantial number of genes have been found to promote flowering. Many, however, including those encoding the transcription factors APETALA1, LEAFY, and CONSTANS, produce extreme dwarfing and/or shoot termination when over-expressed. Importantly, overexpression of G157 was not observed to have deleterious effects. In particular, 35S::G157 transgenic Arabidopsis plants are healthy and attain a wild-type stature when mature. Irrespective of the mode of G157 action, and whether its true biological role is as an activator or a repressor of flowering, the results suggest that G157 can be applied to produce either early or late flowering, according to the level of over-expression of the particular polynucleotide.

Examples of some of the utilities that may be desirable in plants, and that may be provided by transforming the plants with the presently disclosed sequences, are listed in Table 6. Many of the transcription factors listed in Table 6 may be operably linked with a specific promoter that causes the transcription factor to be expressed in response to environmental, tissue-specific or temporal signals. For example, G362 induces ectopic trichomes on flowers but also produces small plants. The former may be desirable to produce insect or herbivore resistance, or increased cotton yield, but the latter may be undesirable in that it may reduce biomass. However, by operably linking G362 with a flower-specific promoter, one may achieve the desirable benefits of the gene without affecting overall biomass to a significant degree. For examples of flower-specific promoters, see Kaiser et al. (supra). For examples of other tissue-specific, temporal-specific or inducible promoters, see the above discussion under the heading “Vectors, Promoters, and Expression Systems”.
TABLE 6 Genes, traits and utilitiesthat affect plant characteristics Transcription factor genes that TraitCategory Phenotype(s) impact traits Utility Abiotic stress Effect ofchilling on plants Increased sensitivity G394; G864 Improvedgermination, growth Increased tolerance G256; G664; G1274 rate, earlierplanting, yield Germination in cold Increased sensitivity G134 Earlierplanting; improved Increased tolerance G224; G256; G664; survival, yieldG1274; G1322 Freezing tolerance Increased tolerance G912 Earlierplanting; improved quality, survival, yield Drought Increased toleranceG46; G912; G1820 Improved survival, vigor, appearance, yield HeatIncreased sensitivity G3; G1645 Improved germination, growth rate, laterplanting, yield Increased tolerance G464; G682; G864; G964; G1305Osmotic stress Increased sensitivity G502; G526; G921 Abiotic stressresponse manipulation Increased tolerance G188; G325; G489; Improvedgermination rate, G1069; G1089; G1412; survival, yield G1820 Salttolerance Increased sensitivity G545 Manipulation of response to highsalt conditions Increased tolerance G22; G196; G226; Improvedgermination rate, G482; G624; G801; survival, yield; extended growthG867; G884 range Nitrogen stress More sensitive to N limitation G1136Manipulation of response to low nutrient conditions Less sensitive to Nlimitation G153; G225; G226; Improved yield and nutrient G1274; G1816;G2718 stress tolerance, decreased fertilizer usage Phosphate stress Moresensitive to P limitation Manipulation of response to low nutrientconditions Less sensitive to P limitation G545; G624 Improved yield andnutrient stress tolerance, decreased fertilizer usage Potassium stressIncreased tolerance to K G419; G561; G911 Improved yield and nutrientlimitation stress tolerance, decreased fertilizer usage Oxidative stressIncreased sensitivity G477; G789 Improved yield, quality, Increasedtolerance ultraviolet and chemical stress tolerance Herbicide GlyphosateGeneration of glyphosate- resistant plants to 
improve weed controlHormone sensitivity Abscisic acid (ABA) sensitivity Reduced sensitivityor G1412; G1069; G1820 Modification of seed insensitive to ABAdevelopment, improved seed dormancy, cold and dehydration toleranceSensitivity to ethylene Hypersensitive to ACC G760 Manipulation of fruitripening Altered response G1062; G1134 Manipulation of fruit ripeningInsensitive to ethylene Manipulation of fruit ripening Disease BotrytisIncreased susceptibility G207; G248; G261; Manipulation of response toG865; G881; G1255 disease organism Increased resistance or G28 Improvedyield, appearance, tolerance survival, extended range Fusarium Increasedsusceptibility G188; G545; G896 Manipulation of response to diseaseorganism Increased resistance or Improved yield, appearance, tolerancesurvival, extended range Erysiphe Increased susceptibility G545; G865;G881 Manipulation of response to disease organism Increased resistanceor G19; G28; G237; G378; Improved yield, appearance, tolerance G409;G591; G616; survival, extended range G869; G1048; G1266 PseudomonasIncreased susceptibility G545 Manipulation of response to diseaseorganism Increased resistance or G418; G525 Improved yield, appearance,tolerance survival, extended range Sclerotinia Increased susceptibilityG477; G789; G805 Manipulation of response to disease organism Increasedresistance or G28 Improved yield, appearance, tolerance survival,extended range Growth regulator Altered sugar sensing Alteration ofenergy balance, Decreased tolerance G26; G38; G43; G207; photosyntheticrate, to sugars G241; G254; G263; carbohydrate accumulation, G308; G536;G567; biomass production, source-sink G680; G912; G996; relationships,senescence; G1068; G1314; G1337 alteration of storage compound Increasedtolerance G224; G782; G783; accumulation in seeds to sugars G867; G905;G1493; G1816 Altered C/N sensing G153 Modification of sensing andrespond to changes C and N metabolite levels, regulation of expressionand activity of proteins involved in C 
and N transport and metabolismFlowering time Early flowering G142; G145; G146; Faster generation time;G157; G180; G184; synchrony of flowering; G185; G208; G227; additionalharvests within a G255; G390; G475; growing season, shortening of G590;G592; G627; breeding programs G789; G865; G1037; G1242; G1305; G1380;G1545; G1760; G1820; G1842; G1843; G2010; G2347 Late flowering G8; G157;G192; G198; Increased yield or biomass, G214; G234; G249; alleviate riskof transgenic G361; G434; G486; pollen escape, synchrony of G562; G571;G591; flowering G624; G680; G736; G738; G748; G752; G859; G878; G903;G9121 G971; G994; G1052; G1073; G1136; G1335; G1435; G1451; G1468;G1474; G1493 Extended period of G1947 Increased fertilization, yield,flowering ornamental applications Development and Altered flowerstructure morphology Stamen G470; G615; G869; Ornamental modification ofG1425; G1499; G1560 plant architecture, improved or Sepal G1075; G1326reduced fertility to mitigate Petal G638; G671; G1075; escape oftransgenic pollen, G1326; G1449; G1499; improved fruit size, shape,G1560 number or yield Pedicel Carpel Homeotic transformation G134; G638;G779 Multiple alterations G187; G580; G615; G671; G732; G1075; G1425;G1449; G1499; G1560; G1645 Enlarged floral organs Siliques G1134 Reducedfertility G559; G615; G638; G671; G779; G977; G1067; G1075; G1266;G1311; G1326; G1560; G1645; G1947 Aerial rosettes Inflorescencearchitectural Ornamental modification of change flower architecture;timing of Altered branching pattern flowering; altered plant habit forShort internodes/bushy G385; G580; G865; yield or harvestabilitybenefit; inflorescences G1274; G1425; G1474 reduction in pollenproduction Internode elongation G1274 of genetically modified plants;Terminal flowers G145 manipulation of seasonality and Poorly developedannual or perennial habit; inflorescences manipulation of determinatevs. 
Lack of inflorescence G1499 indeterminate growth Altered shootmeristem Ornamental modification of development plant architecture,manipulation Stem bifurcations G390 of growth and development, No shootmeristem G1472 increase in leaf numbers, Reduced meristem cell G1540modulation of branching differentiation patterns to provide improvedyield or biomass Altered branching pattern G438; G1499 Ornamentalmodification of plant architecture, improved lodging resistance Alteredphyllotaxy G638 Ornamental modification of plant architecture Apicaldominance Reduced apical dominance G559; G732; G1255; Ornamentalmodification of G1275; G1645 plant architecture Altered trichomedensity; Ornamental modification of development, or structure plantarchitecture, increased Reduced or no trichomes G25; G212; G225; plantproduct (e.g., diterpenes, G226; G676; G682; cotton) productivity,insect and G1332; G1816; G2718 herbivore resistance Ectopictrichomes/altered G247 trichome distribution or development/cell fateIncreased trichome number, G634 density and/or size Stem morphology andaltered G438; G748 Modulation of lignin content; vascular tissuestructure improvement of wood, palatability of fruits and vegetablesRoot development Increased root growth and G9; G1482 Improved yield,stress tolerance; proliferation anchorage Decreased root growthModification of root architecture and mass Increased root hairs G225;G226; G682; Improved yield, stress tolerance; G1816; G2718 anchorageAltered seed development, G979 Modification of seed ripening andgermination germination properties and performance Cell differentiationand cell G1540 Increase in carpel or fruit proliferation development;improve regeneration of shoots from callus in transformation ormicro-propagation systems Rapid growth and/or Promote faster developmentand development reproduction in plants Slow growth G447; G740; G1062;Ornamental applications G1335; G1468; G1474 Senescence Prematuresenescence G636; G1128 Improvement in response to Delayed 
senescenceG249; G571; G878; disease, fruit ripening G1050 Lethality whenoverexpressed G374; G515; G578; Herbicide target; ablation of G877;G1076; G1304 specific tissues or organs such as stamen to prevent pollenescape Necrosis G24 Disease resistance Plant size Increased plant sizeG46; G624; G1073; Improved yield, biomass, G1435; G1451; G1468appearance Larger seedlings G1313 Increased survival and vigor ofseedlings, yield Dwarfed or more G3; G5; G21; G24; G27; Dwarfism,lodging resistance, compact plants G184; G194; G258; manipulation ofgibberellin G280; G385; G447; responses G636; G670; G671; G738; G760;G831; G864; G869; G884; G932; G977; G991; G994; G1020; G1023; G1053;G1067; G1075; G1181; G1228; G1266; G1267; G1275; G1277; G1309; G1311;G1314; G1317; G1322; G1323; G1326; G1332; G1334; G1382; G1474; G1545;G1560 Leaf morphology Dark green leaves G385; G447; G912; Increasedphotosynthesis, G932; G977; G1128; biomass, yield; ornamental G1267;G1323; G1327; applications G1334; G1499 Change in leaf shape G32; G224;G428; Ornamental applications G464; G629; G671; G736; G903; G905; G921;G932; G977; G1038; G1067; G1073; G1075; G1269; G1468; G1493; G1645Altered leaf development G1645 Ornamental applications Altered leaf sizeIncreased leaf size G438; G1274; G1451 Increased yield, ornamental andmass applications Gray leaves G1468 Ornamental applications VariegationOrnamental applications Glossy leaves G975; G1267 Ornamentalapplications, manipulation of wax composition, amount, or distributionCell expansion G521 Ornamental applications; modification of adaptationto environmental changes or damage Seed morphology Altered seedcoloration G156; G663; G668 Appearance, ornamental applications Seedsize and shape Increased seed size G206; G584; G1255 Yield, appearanceDecreased seed size G1145 Appearance Altered seed shape G1062; G1145Appearance Leaf biochemistry Increased leaf wax G975; G1267 Insect,pathogen resistance Leaf prenyl lipids Improved antioxidant and Reducedchlorophyll vitamin 
E content, nutritional Increase in tocopherols G280;G987 content; prevention of ARMD; Increased lutein content G214 modifiedphotosynthetic Decreased lutein content G1133; G1324; G1328 capabilityIncrease in chlorophyll or G214; G987; G1324 carotenoids Decrease inchlorophyll or G987 carotenoids Leaf insoluble sugars Alteration ofplant cell wall Increased leaf insoluble G211; G237; G242; compositionaffecting food sugars including xylose, G274; G307; G428; digestibility,plant tensile fucose, rhamnose, G435; G525; G598; strength, woodquality, pathogen galactose, arabinose or G777; G869; G1309 resistanceand pulp production. mannose Decreased leaf insoluble G307; G428; G1012Alteration of plant cell wall sugars, including composition affectingfood rhamnose, arabinose or digestibility, plant tensile mannosestrength, wood quality, pathogen resistance and pulp production.Increased leaf anthocyanins G663 Increased photosynthesis, biomass,yield; ornamental applications Leaf fatty acids Modification ofnutritional Reduction in leaf G718; G1266; G1347 content; heat stabilityof fatty acids essential oils Increase in leaf G214; G231 fatty acidsAltered leaf G185; G681; G1069; Modification of toxic glucosinolatecontent G1198; G1322 glucosinolate content in animal feeds; anti-canceractivity Seed biochemistry Seed oil content Increased oil content G162;G229; G231; Improved oil yield G291; G456; G464; G561; G590; G598; G732;G849; G961; G1190; G1198 Decreased oil content G180; G192; G201; Reducedcaloric content G222; G241; G663; G668; G718; G732; G777; G911; G1323;G1820 Altered oil content G509; G567; G732; Modification of seed caloricG974; G1451; G1471 content and nutritional value Seed protein contentIncreased protein content G201; G222; G226; Improved protein yield,G241; G629; G630; nutritional value G663; G668; G718; G732; G865; G911;G1048; G1323; G1449; G1820 Decreased protein content G229; G231; G418;Reduced caloric content G456; G464; G732; G1634 Altered protein contentG162; G509; 
G567; Modification of seed caloric G732; G849 content and nutritional value Altered seed prenyl G214; G718; G748; Improved antioxidant and lipid content G883; G1052 vitamin E content; prevention of ARMD Seed glucosinolate Modification of toxic Altered profile glucosinolate content in animal feeds; anti-cancer activity Increased seed anthocyanins Ornamental applications Increased seed sterols G20 Alteration of human steroid precursors content, some of which have cholesterol-lowering activity Increased seed fatty acid G778; G861; G869; Modification of seed caloric composition G938; G965; G1399; content and nutritional value G1417 Decreased seed fatty acid G776; G791; G938; Modification of seed caloric composition G987; G1417 content and nutritional value Up-regulation of genes G229 Alteration of tolerance to insect involved herbivores, UV, oxidative stress, in secondary metabolism and pathogen attack; increased production of valuable alkaloid-base medicines Root Biochemistry Increased root anthocyanins G663 Ornamental applications, improved anti-cancer activity Light response/shade Altered cotyledon, hypocotyl, G351; G1062; G1322 Potential for increased planting avoidance petiole development; densities and yield enhancement altered leaf orientation; constitutive photomorphogenesis; photomorphogenesis in low light Pigment Increased anthocyanin level G663; G1482 Enhanced health benefits, improved ornamental appearance, increased stress resistance, attraction of pollinating and seed-dispersing animals
Abbreviations:
N = nitrogen
P = phosphate
ABA = abscisic acid
C/N = carbon/nitrogen balance

Detailed Description of Genes, Traits and Utilities that Affect Plant Characteristics

The following descriptions of traits and utilities associated with the present transcription factors offer a more comprehensive description than that provided in Table 6.

Abiotic Stress, General Considerations

Plant transcription factors can modulate gene expression and, in turn, be modulated by the environmental experience of a plant. Significant alterations in a plant's environment invariably result in a change in the plant's transcription factor gene expression pattern. Altered transcription factor expression patterns generally result in phenotypic changes in the plant. Transcription factor gene products in transgenic plants then differ in amounts or proportions from those found in wild-type or non-transformed plants, and those transcription factors likely represent polypeptides that are used to alter the response to the environmental change. By way of example, it is well accepted in the art that analytical methods based on altered expression patterns may be used to screen for phenotypic changes in a plant far more effectively than can be achieved using traditional methods.

Abiotic stress: adult stage chilling. Enhanced chilling tolerance may extend the effective growth range of chilling-sensitive crop species by allowing earlier planting or later harvest. A gene that enhances growth under such conditions could also enhance yields and reduce fertilizer and herbicide usage.

Another large impact of chilling occurs during post-harvest storage. For example, some fruits and vegetables do not store well at low temperatures (for example, bananas, avocados, melons, and tomatoes). The normal ripening process of the tomato is impaired if it is exposed to cool temperatures. Transcription factor genes conferring resistance to chilling temperatures, including G256, G664, G1274 and their equivalogs, may thus enhance tolerance during post-harvest storage.

Improved chilling tolerance may be conferred by increased expression of glycerol-3-phosphate acyltransferase in chloroplasts (see, for example, Wolter et al. (1992) EMBO J. 4685-4692).

Chilling tolerance could also serve as a model for understanding how plants adapt to water deficit. Chilling and water stress share similar sensory transduction pathways and tolerance/adaptation mechanisms. For example, acclimation to chilling temperatures can be induced by water stress or treatment with abscisic acid. Genes induced by low temperature include dehydrins (or LEA proteins). Dehydrins are also induced by salinity, abscisic acid, water stress or during the late stages of embryogenesis. Thus, genes that protect the plant against chilling could also have a role in protection against water deficit.

Abiotic stress: cold germination. Several of the presently disclosed transcription factor genes confer better germination and growth in cold conditions. For example, the improved germination in cold conditions seen with G224, G256, G664, G1274, and G1322 indicates a role in regulation of cold responses by these genes and their equivalogs. These genes might be engineered to manipulate the response to low temperature stress. Genes that would allow germination and seedling vigor in the cold would have highly significant utility in allowing seeds to be planted earlier in the season with a high rate of survival. Transcription factor genes that confer better survival in cooler climates allow a grower to move up planting time in the spring and extend the growing season further into autumn for higher crop yields. Germination of seeds and survival at temperatures significantly below the mean temperature required for germination of seeds and survival of non-transformed plants would increase the potential range of a crop plant into regions in which it would otherwise fail to thrive.

Abiotic stress: freezing tolerance and osmotic stress. Presently disclosed transcription factor genes, including G188, G325, G489, G1069, G1089, G1412, G1820 and their equivalogs, that may increase germination rate and/or growth under adverse osmotic conditions, could impact survival and yield of seeds and plants. Osmotic stresses may be regulated by specific molecular control mechanisms that include genes controlling water and ion movements, functional and structural stress-induced proteins, signal perception and transduction, free radical scavenging, and many others (Wang et al. (2001) Acta Hort. (ISHS) 560: 285-292). Instigators of osmotic stress include freezing, drought and high salinity, each of which is discussed in more detail below.

In many ways, freezing, high salt and drought have similar effects on plants, not the least of which is the induction of common polypeptides that respond to these different stresses. For example, freezing is similar to water deficit in that freezing reduces the amount of water available to a plant. Exposure to freezing temperatures may lead to cellular dehydration as water leaves cells and forms ice crystals in intercellular spaces (Buchanan, supra). As with high salt concentration and freezing, the problems for plants caused by low water availability include mechanical stresses caused by the withdrawal of cellular water. Thus, the incorporation of transcription factors that modify a plant's response to osmotic stress or improve tolerance to it (e.g., G912 or its equivalogs) into, for example, a crop or ornamental plant may be useful in reducing damage or loss. Specific effects caused by freezing, high salt and drought are addressed below.

Abiotic stress: drought and low humidity tolerance. Exposure to dehydration invokes similar survival strategies in plants as does freezing stress (see, for example, Yelenosky (1989) Plant Physiol 89: 444-451) and drought stress induces freezing tolerance (see, for example, Siminovitch et al. (1982) Plant Physiol 69: 250-255; and Guy et al. (1992) Planta 188: 265-270). In addition to the induction of cold-acclimation proteins, strategies that allow plants to survive in low water conditions may include, for example, reduced surface area, or surface oil or wax production. A number of presently disclosed transcription factor genes, e.g., G46, G912, and G1820, increase a plant's tolerance to low water conditions and, along with their equivalogs, would provide the benefits of improved survival, increased yield and an extended geographic and temporal planting range.

Abiotic stress: heat stress tolerance. The germination of many crops is also sensitive to high temperatures. Presently disclosed transcription factor genes that provide increased heat tolerance, including G464, G682, G864, G964, G1305 and their equivalogs, would be generally useful in producing plants that germinate and grow in hot conditions. They may find particular use for crops that are planted late in the season, or may extend the range of a plant by allowing growth in relatively hot climates.

Abiotic stress: salt. The genes in Table 6 that provide tolerance to salt may be used to engineer salt tolerant crops and trees that can flourish in soils with high saline content or under drought conditions. In particular, increased salt tolerance during the germination stage of a plant enhances survival and yield. Presently disclosed transcription factor genes, including G22, G196, G226, G482, G624, G801, G867, G884, and their equivalogs that provide increased salt tolerance during germination, the seedling stage, and throughout a plant's life cycle, would find particular value for imparting survival and yield in areas where a particular crop would not normally prosper.

Nutrient uptake and utilization: nitrogen and phosphorus. Presently disclosed transcription factor genes introduced into plants provide a means to improve uptake of essential nutrients, including nitrogenous compounds, phosphates, potassium, and trace minerals. The enhanced performance of, for example, G153, G225, G226, G682, G1274, G1816, G2718 and other overexpressing lines under low nitrogen, and of G545 and G624 under low phosphorus conditions, indicates that these genes and their equivalogs can be used to engineer crops that could thrive under conditions of reduced nutrient availability. Phosphorus, in particular, tends to be a limiting nutrient in soils and is generally added as a component in fertilizers. Young plants have a rapid intake of phosphate, and sufficient phosphate is important for yield of root crops such as carrot, potato and parsnip.

The effect of these modifications is to increase the seedling germination and range of ornamental and crop plants. The utilities of presently disclosed transcription factor genes conferring tolerance to conditions of low nutrients also include cost savings to the grower by reducing the amounts of fertilizer needed, environmental benefits of reduced fertilizer runoff into watersheds, and improved yield and stress tolerance. In addition, by providing improved nitrogen uptake capability, these genes can be used to alter seed protein amounts and/or composition in such a way that could impact yield as well as the nutritional value and production of various food products.

That a number of the transcription factor-overexpressing lines make less anthocyanin on high sucrose plus glutamine indicates that these genes can be used to modify carbon and nitrogen status, and hence assimilate partitioning (assimilate partitioning refers to the manner in which an essential element, such as nitrogen, is distributed among different pools inside a plant, generally in a reduced form, for the purpose of transport to various tissues).

Increased tolerance of plants to oxidative stress. In plants, as in all living things, abiotic and biotic stresses induce the formation of oxygen radicals, including superoxide and peroxide radicals. This has the effect of accelerating senescence, particularly in leaves, with the resulting loss of yield and adverse effect on appearance. Generally, plants that have the highest level of defense mechanisms, such as, for example, polyunsaturated moieties of membrane lipids, are most likely to thrive under conditions that introduce oxidative stress (e.g., high light, ozone, water deficit, particularly in combination). Introduction of the presently disclosed transcription factor genes, including G477, G789 and their equivalogs, that increase the level of oxidative stress defense mechanisms would provide beneficial effects on the yield and appearance of plants. One specific oxidizing agent, ozone, has been shown to cause significant foliar injury, which impacts yield and appearance of crop and ornamental plants. In addition to the reduced foliar injury that would be found in ozone resistant plants created by transforming plants with some of the presently disclosed transcription factor genes, the latter have also been shown to have increased chlorophyll fluorescence (Yu-Sen Chang et al. (2001) Bot. Bull. Acad. Sin. 42: 265-272).

Decreased herbicide sensitivity. Presently disclosed transcription factor genes that confer resistance or tolerance to herbicides (e.g., glyphosate) will find use in providing means to increase herbicide applications without detriment to desirable plants. This would allow for the increased use of a particular herbicide in a local environment, with the effect of increased detriment to undesirable species and less harm to transgenic, desirable cultivars.

Knockouts of a number of the presently disclosed transcription factor genes, including G374 and G877, have been shown to be lethal to developing embryos. Thus, these genes and their equivalogs are potentially useful as herbicide targets.

Hormone sensitivity. ABA plays regulatory roles in a host of physiological processes in all higher as well as in lower plants (Davies et al. (1991) Abscisic Acid: Physiology and Biochemistry. Bios Scientific Publishers, Oxford, UK; Zeevaart et al. (1988) Ann Rev Plant Physiol. Plant Mol. Biol. 39: 439-473; Shimizu-Sato et al. (2001) Plant Physiol 127: 1405-1413). ABA mediates stress tolerance responses in higher plants, is a key signal compound that regulates stomatal aperture and, in concert with other plant signaling compounds, is implicated in mediating responses to pathogens and wounding or oxidative damage (for example, see Larkindale et al. (2002) Plant Physiol. 128: 682-695). In seeds, ABA promotes seed development, embryo maturation, synthesis of storage products (proteins and lipids), and desiccation tolerance, and is involved in maintenance of dormancy (inhibition of germination) and apoptosis (Zeevaart et al. (1988) Ann Rev Plant Physiol. Plant Mol. Biol. 39: 439-473; Davies (1991), supra; Thomas (1993) Plant Cell 5: 1401-1410; and Bethke et al. (1999) Plant Cell 11: 1033-1046). ABA also affects plant architecture, including root growth and morphology and root-to-shoot ratios. ABA action and metabolism are modulated not only by environmental signals but also by endogenous signals generated by metabolic feedback, transport, hormonal cross-talk and developmental stage. Manipulation of ABA levels, and hence by extension the sensitivity to ABA, has been described as a very promising means to improve productivity, performance and architecture in plants (Zeevaart (1999) in: Biochemistry and Molecular Biology of Plant Hormones, Hooykaas et al. eds, Elsevier Science pp 189-207; and Cutler et al. (1999) Trends Plant Sci. 4: 472-478).

A number of the presently disclosed transcription factor genes affect plant abscisic acid (ABA) sensitivity, including G1069, G1412, and G1820. Thus, by affecting ABA sensitivity, these introduced transcription factor genes and their equivalogs would affect cold, drought, oxidative and other stress sensitivities, plant architecture, and yield.

Several others of the present transcription factor genes have been used to manipulate ethylene signal transduction and response pathways. These genes, including G760, G1062, G1134 and their equivalogs, may thus be used to manipulate the processes influenced by ethylene, such as seed germination or fruit ripening, and to improve seed or fruit quality.

Diseases, pathogens and pests. A number of the presently disclosed transcription factor genes have been shown to or are likely to affect a plant's response to various plant diseases, pathogens and pests. The offending organisms include the fungal pathogens Fusarium oxysporum, Botrytis cinerea, Sclerotinia sclerotiorum, and Erysiphe orontii. Bacterial pathogens to which resistance may be conferred include Pseudomonas syringae. Other problem organisms may potentially include nematodes, mollicutes, parasites, or herbivorous arthropods. In each case, one or more transformed transcription factor genes may provide some benefit to the plant to help prevent or overcome infestation, or be used to manipulate any of the various plant responses to disease. The mechanisms by which the transcription factors work could include increasing surface waxes or oils, surface thickness, or the activation of signal transduction pathways that regulate plant defense in response to attacks by herbivorous pests (including, for example, protease inhibitors). Another means to combat fungal and other pathogens is by accelerating local cell death or senescence, mechanisms used to impair the spread of pathogenic microorganisms throughout a plant. For instance, the best known example of accelerated cell death is the resistance gene-mediated hypersensitive response, which causes localized cell death at an infection site and initiates a systemic defense response. Because many defenses, signaling molecules, and signal transduction pathways are common to defense against different pathogens and pests, such as fungal, bacterial, oomycete, nematode, and insect, transcription factors that are implicated in defense responses against the fungal pathogens tested may also function in defense against other pathogens and pests. These transcription factors include, for example, G28 (improved resistance or tolerance to Botrytis), G19, G28, G237, G378, G409, G591, G616, G869, G1048, G1266 (improved resistance or tolerance to Erysiphe), G418, G525 (improved resistance or tolerance to Pseudomonas), G28 (improved resistance or tolerance to Sclerotinia), and their equivalogs.

Growth regulator: sugar sensing. In addition to their important role as an energy source and structural component of the plant cell, sugars are central regulatory molecules that control several aspects of plant physiology, metabolism and development (Hsieh et al. (1998) Proc. Natl. Acad. Sci. 95: 13965-13970). It is thought that this control is achieved by regulating gene expression and, in higher plants, sugars have been shown to repress or activate plant genes involved in many essential processes such as photosynthesis, glyoxylate metabolism, respiration, starch and sucrose synthesis and degradation, pathogen response, wounding response, cell cycle regulation, pigmentation, flowering and senescence. The mechanisms by which sugars control gene expression are not understood.

Because sugars are important signaling molecules, the ability to control either the concentration of a signaling sugar or how the plant perceives or responds to a signaling sugar could be used to control plant development, physiology or metabolism. For example, the flux of sucrose (a disaccharide sugar used for systemically transporting carbon and energy in most plants) has been shown to affect gene expression and alter storage compound accumulation in seeds. Manipulation of the sucrose signaling pathway in seeds may therefore cause seeds to have more protein, oil or carbohydrate, depending on the type of manipulation. Similarly, in tubers, sucrose is converted to starch, which is used as an energy store. It is thought that sugar signaling pathways may partially determine the levels of starch synthesized in the tubers. The manipulation of sugar signaling in tubers could lead to tubers with a higher starch content.

Thus, the presently disclosed transcription factor genes that manipulate the sugar signal transduction pathway, including G26, G38, G43, G207, G224, G241, G254, G263, G308, G536, G567, G680, G782, G783, G867, G905, G912, G996, G1068, G1314, G1347, and G1493, along with their equivalogs, may lead to altered gene expression to produce plants with desirable traits. In particular, manipulation of sugar signal transduction pathways could be used to alter source-sink relationships in seeds, tubers, roots and other storage organs, leading to an increase in yield.

Growth regulator: C/N sensing. Nitrogen and carbon metabolism are tightly linked in almost every biochemical pathway in the plant. Carbon metabolites regulate genes involved in N acquisition and metabolism, and are known to affect germination and the expression of photosynthetic genes (Coruzzi et al. (2001) Plant Physiol. 125: 61-64) and hence growth. Early studies on nitrate reductase (NR) in 1976 showed that NR activity could be affected by Glc/Suc (Crawford (1995) Plant Cell 7: 859-886; Daniel-Vedele et al. (1996) CR Acad Sci Paris 319: 961-968). Those observations were supported by later experiments that showed sugars induce NR mRNA in dark-adapted, green seedlings (Cheng et al. (1992) Proc Natl Acad Sci USA 89: 1861-1864). C and N may have antagonistic relationships as signaling molecules; light induction of NR activity and mRNA levels can be mimicked by C metabolites, and N-metabolites cause repression of NR induction in tobacco (Vincentz et al. (1992) Plant J 3: 315-324). Gene regulation by C/N status has been demonstrated for a number of N-metabolic genes (Stitt (1999) Curr. Opin. Plant. Biol. 2: 178-186; Coruzzi et al. (2001) supra). Thus, transcription factor genes that affect C/N sensing, such as G153 or its equivalogs, can be used to alter or improve germination and growth under nitrogen-limiting conditions.

Flowering Time: Early, Late and Inducible Flowering.

Early flowering. Presently disclosed transcription factor genes that accelerate flowering, which include G142, G145, G146, G153, G157, G180, G184, G185, G208, G227, G255, G390, G475, G590, G592, G627, G789, G865, G1037, G1242, G1305, G1380, G1545, G1760, G1820, G1842, G1843, G1844, G2010, G2347, and their functional equivalogs, could have valuable applications in breeding programs, since they allow much faster generation times. In a number of species, for example, broccoli and cauliflower, where the reproductive parts of the plants constitute the crop and the vegetative tissues are discarded, it would be advantageous to accelerate time to flowering. Accelerating flowering could shorten crop and tree breeding programs. Additionally, in some instances, a faster generation time would allow additional harvests of a crop to be made within a given growing season. A number of Arabidopsis genes have already been shown to accelerate flowering when constitutively expressed. These include LEAFY, APETALA1 and CONSTANS (Mandel et al. (1995) Nature 377: 522-524; Weigel and Nilsson (1995) Nature 377: 495-500; Simon et al. (1996) Nature 384: 59-62).

With the advent of transformation systems for tree species such as oil palm and Eucalyptus, forest biotechnology is a growing area of interest. Acceleration of flowering, again, might reduce generation times and make breeding programs feasible which would otherwise be impossible, such as with plants with multi-year cycles (such as biennials, e.g. carrot, or fruit trees, such as citrus) that can be very slow to develop and begin flowering. That this is a real possibility has already been demonstrated in aspen, a tree species that usually takes 8-20 years to flower. Transgenic aspen that over-express the Arabidopsis LFY gene flower after only 5 months. The flowers produced by these young aspen plants, however, were sterile; the challenge of producing fertile early flowering trees therefore still remains (Weigel and Nilsson (1995) Nature 377: 495-500).

Breeding programs for the development of new varieties can be limited by the seed-to-seed cycle.

Inducible flowering. By using inducible promoters to regulate the expression of genes that potentially control flowering, flowering could be triggered by application of an inducer chemical. This would allow flowering to be synchronized across a crop and facilitate more efficient harvesting (e.g., strawberry). Such inducible systems could also be used to tune the flowering of crop varieties to different latitudes. At present, species such as soybean and cotton are available as a series of maturity groups that are suitable for different latitudes on the basis of their flowering time (which is governed by day-length). A system in which flowering could be chemically controlled would allow a single high-yielding northern maturity group to be grown at any latitude. In southern regions such plants could be grown for longer periods before flowering was induced, thereby increasing yields. In more northern areas, the induction would be used to ensure that the crop flowers prior to the first winter frosts. Currently, the existence of a series of maturity groups for different latitudes represents a major barrier to the introduction of new valuable traits. Any trait (e.g. disease resistance) has to be bred into each of the different maturity groups separately, a laborious and costly exercise. The availability of a single strain, which could be grown at any latitude, would therefore greatly increase the potential for introducing new traits to crop species such as soybean and cotton.

Late flowering. In a sizeable number of species, for example, root crops, where the vegetative parts of the plants constitute the crop (e.g., onions, lettuce) and the reproductive tissues are discarded, it is advantageous to identify and incorporate transcription factor genes that delay or prevent flowering in order to prevent resources being diverted into reproductive development. For example, G8, G157, G192, G198, G214, G234, G249, G361, G434, G486, G562, G571, G591, G624, G680, G736, G738, G748, G752, G859, G878, G903, G912, G971, G994, G1052, G1073, G1136, G1335, G1435, G1451, G1468, G1474, G1493, and equivalogs, delay flowering time in transgenic plants. Extending vegetative development with presently disclosed transcription factor genes could thus bring about large increases in yields. Prevention of flowering can help maximize vegetative yields and prevent escape of genetically modified organism (GMO) pollen.

Presently disclosed transcription factors that extend flowering time, including G1947, have utility in engineering plants with longer-lasting flowers for the horticulture industry, and for extending the time in which the plant is fertile.

A number of the presently disclosed transcription factors may extend flowering time and delay flower abscission, which would have utility in engineering plants with longer-lasting flowers for the horticulture industry. This would provide a significant benefit to the ornamental industry, for both cut flowers and woody plant varieties (of, for example, maize), as well as have the potential to lengthen the fertile period of a plant, which could positively impact yield and breeding programs.

General development and morphology: flower structure and inflorescence: architecture, altered flower organs, reduced fertility, multiple alterations, aerial rosettes, branching, internode distance, terminal flowers and phase change. Presently disclosed transgenic transcription factors such as G134, G187, G470, G580, G615, G638, G671, G732, G779, G869, G1075, G1134, G1326, G1425, G1449, G1499, G1645, and their equivalogs, may be used to create plants with larger flowers or arrangements of flowers that are distinct from wild-type or non-transformed cultivars. This would likely have the most value for the ornamental horticulture industry, where larger flowers or interesting floral configurations are generally preferred and command the highest prices.

Flower structure may have advantageous or deleterious effects on fertility, and could be used, for example, to decrease fertility by the absence, reduction or screening of reproductive components. In fact, plants that overexpress a sizable number of the presently disclosed transcription factor genes, e.g., G559, G615, G638, G671, G779, G977, G1067, G1075, G1266, G1311, G1326, G1645, G1947 and their functional equivalogs, possess reduced fertility; flowers are infertile and fail to yield seed. These could be desirable traits, as low fertility could be exploited to prevent or minimize the escape of the pollen of genetically modified organisms (GMOs) into the environment.

The morphological phenotype shown by plants overexpressing some of the present transcription factors indicates that these genes and their equivalogs may be used to alter inflorescence architecture. In particular, a reduction in pedicel length and a change in the position at which flowers and fruits are held might influence harvesting or pollination efficiency. Additionally, such changes may produce attractive novel forms for the ornamental markets.

One interesting application for manipulation of flower structure, for example, by introduced transcription factors, could be in the increased production of edible flowers or flower parts, including saffron, which is derived from the stigmas of Crocus sativus.

Genes that alter silique conformation in brassicates may be used to modify fruit ripening processes in brassicates and other plants, which may positively affect seed or fruit quality.

A number of the presently disclosed transcription factors may affect the timing of phase changes in plants. Since the timing of phase changes generally affects a plant's eventual size, these genes may prove beneficial by providing means for improving yield and biomass.

General development and morphology: shoot meristem and branching patterns. Several of the presently disclosed transcription factor genes, including G390, when introduced into plants, have been shown to cause stem bifurcations in developing shoots in which the shoot meristems split to form two or three separate shoots. These transcription factors and their functional equivalogs may thus be used to manipulate branching. This would provide a unique appearance, which may be desirable in ornamental applications, and may be used to modify lateral branching for use in the forestry industry. A reduction in the formation of lateral branches (e.g., with G1499 or equivalogs) could reduce knot formation. Conversely, increasing the number of lateral branches (e.g., with G438 or equivalogs) could provide utility when a plant is used as a view- or windscreen.

General development and morphology: apical dominance. The modified expression of presently disclosed transcription factors (e.g., G559, G732, G1255, G1275, G1645, and their equivalogs) that reduce apical dominance could be used in ornamental horticulture to modify plant architecture, for example, to produce a shorter, more bushy stature than wild type. The latter form would have ornamental utility as well as provide increased resistance to lodging.

General development and morphology: trichome density, development or structure. Several of the presently disclosed transcription factor genes have been used to modify trichome number, density, trichome cell fate, or the amount of trichome products produced by plants, or to produce ectopic trichome formation. These may include G25, G212, G225, G226, G247, G634, G676, G682, G1332, G1816, G2718, and their equivalogs. In most cases where the metabolic pathways are impossible to engineer, increasing trichome density or size on leaves may be the only way to increase plant productivity. Thus, by increasing trichome density, size or type, these trichome-affecting genes and their functional equivalogs would have profound utilities in molecular farming practices by making use of trichomes as a manufacturing system for complex secondary metabolites.

Trichome glands on the surface of many higher plants produce and secrete exudates that give protection from the elements and pests such as insects, microbes and herbivores. These exudates may physically immobilize insects and spores, may be insecticidal or antimicrobial, or may act as allergens or irritants to protect against herbivores. By modifying trichome location, density or activity with presently disclosed transcription factors that modify these plant characteristics, plants that are better protected and higher yielding may be the result.

A potential application for these trichome-affecting genes and their equivalogs also exists in cotton: cotton fibers are modified unicellular trichomes that develop from the outer ovule epidermis. In fact, only about 30% of these epidermal cells develop into trichomes, but all have the potential to develop a trichome fate. Trichome-affecting genes can trigger an increased number of these cells to develop as trichomes and thereby increase the yield of cotton fibers. Since the mallow family is closely related to the Brassica family, genes involved in trichome formation will likely have homologs in cotton or function in cotton.

If the effects on trichome patterning reflect a general change in heterochronic processes, trichome-affecting transcription factors or their equivalogs can be used to modify the way meristems and/or cells develop during different phases of the plant life cycle. In particular, altering the timing of phase changes could afford positive effects on yield and biomass production.

General development and morphology: stem morphology and altered vascular tissue structure. Plants transformed with transcription factor genes that modify stem morphology or lignin content may be used to affect overall plant architecture and the distribution of lignified fiber cells within the stem.

Modulating lignin content might allow the quality of wood used for furniture or construction to be improved. Lignin is energy rich; increasing lignin composition could therefore be valuable in raising the energy content of wood used for fuel. Conversely, the pulp and paper industries seek wood with a reduced lignin content. Currently, lignin must be removed in a costly process that involves the use of many polluting chemicals. Consequently, lignin is a serious barrier to efficient pulp and paper production (Tzfira et al. (1998) TIBTECH 16: 439-446; Robinson (1999) Nature Biotechnology 17: 27-30). In addition to forest biotechnology applications, changing lignin content by selectively expressing or repressing transcription factors in fruits and vegetables might increase their palatability.

Transcription factors that modify stem structure, including G438, G748 and their equivalogs, may also be used to achieve reduction of higher-order shoot development, resulting in significant plant architecture modification. Overexpression of the genes that encode these transcription factors in woody plants might result in trees that lack side branches, and have fewer knots in the wood. Altering branching patterns could also have applications amongst ornamental and agricultural crops. For example, applications might exist in any species where secondary shoots currently have to be removed manually, or where changes in branching pattern could increase yield or facilitate more efficient harvesting.

General development and morphology: altered root development. By modifying the structure or development of roots by transforming into a plant one or more of the presently disclosed transcription factor genes, including G9, G225, G226, G1482, and their equivalogs, plants may be produced that have the capacity to thrive in otherwise unproductive soils. For example, grape roots that extend further into rocky soils, provide greater anchorage, give greater coverage through increased branching, or remain viable in waterlogged soils would increase the effective planting range of the crop and/or increase yield and survival. It may also be advantageous to manipulate a plant to produce short roots, as when the soil in which the plant will be growing is occasionally flooded, or when pathogenic fungi or disease-causing nematodes are prevalent.

In addition, presently disclosed transcription factors including G225, G226, G682, G1816, G2718 and their equivalogs may be used to increase root hair density and thus increase tolerance to abiotic stresses, thereby improving yield and quality.

General development and morphology: seed development, ripening and germination rate. A number of the presently disclosed transcription factor genes (e.g., G979) have been shown to modify seed development and germination rate, including when the seeds are in conditions normally unfavorable for germination (e.g., cold, heat or salt stress, or in the presence of ABA), and may, along with functional equivalogs, thus be used to modify and improve germination rates under adverse conditions.

General development and morphology: cell differentiation and cell proliferation. Several of the disclosed transcription factors regulate cell proliferation and/or differentiation, including G1540 and its functional equivalogs. Control of these processes could have valuable applications in plant transformation, cell culture or micro-propagation systems, as well as in control of the proliferation of particular useful tissues or cell types. Transcription factors that induce the proliferation of undifferentiated cells can be operably linked with an inducible promoter to promote the formation of callus that can be used for transformation or production of cell suspension cultures. Transcription factors that prevent cells from differentiating, such as G1540 or its equivalogs, could be used to confer stem cell identity to cultured cells. Transcription factors that promote differentiation of shoots could be used in transformation or micro-propagation systems, where regeneration of shoots from callus is currently problematic. In addition, transcription factors that regulate the differentiation of specific tissues could be used to increase the proportion of these tissues in a plant. Genes that promote the differentiation of carpel tissue could be introduced into commercial species to induce formation of increased numbers of carpels or fruits. A particular application might exist in saffron, one of the world's most expensive spices. Saffron filaments, or threads, are actually the dried stigmas of the saffron flower, Crocus sativus Linnaeus. Each flower contains only three stigmas, and more than 75,000 of these flowers are needed to produce just one pound of saffron filaments. An increase in carpel number would increase the quantity of stigmatic tissue and improve yield.
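The saffron figures above can be made concrete with a short calculation. The following Python sketch assumes, purely for illustration, that each additional carpel contributes one additional stigma of unchanged dry mass, so that the number of flowers needed per pound scales inversely with stigma number. The function name and this proportionality are assumptions and are not disclosed results; because the text states "more than 75,000" flowers, the baseline is a lower bound.

```python
# Figures taken from the text: three stigmas per flower and roughly
# 75,000 flowers per pound of dried saffron filaments (a lower bound).
BASELINE_STIGMAS_PER_FLOWER = 3
BASELINE_FLOWERS_PER_POUND = 75_000


def flowers_needed_per_pound(stigmas_per_flower: int) -> float:
    """Estimate flowers needed per pound of filaments (hypothetical helper).

    Assumes every stigma has the same dry mass as a wild-type stigma,
    which is an illustrative assumption only.
    """
    if stigmas_per_flower <= 0:
        raise ValueError("stigmas_per_flower must be positive")
    # Total stigmas in one pound, derived from the baseline figures.
    stigmas_per_pound = BASELINE_STIGMAS_PER_FLOWER * BASELINE_FLOWERS_PER_POUND
    return stigmas_per_pound / stigmas_per_flower


if __name__ == "__main__":
    for stigmas in (3, 4, 6):
        needed = flowers_needed_per_pound(stigmas)
        print(f"{stigmas} stigmas per flower: about {needed:,.0f} flowers per pound")
```

Under these assumptions, one pound corresponds to about 225,000 stigmas, and doubling the stigma number per flower would halve the number of flowers required.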

General development and morphology: cell expansion. Plant growth results from a combination of cell division and cell expansion. Transcription factors such as G521 or its equivalogs may be useful in regulation of cell expansion. Altered regulation of cell expansion could affect stem length, an important agronomic characteristic. For instance, short cultivars of wheat contributed to the Green Revolution, because plants that put fewer resources into stem elongation allocate more resources into developing seed and produce higher yield. These plants are also less vulnerable to wind and rain damage. These cultivars were found to be altered in their sensitivity to gibberellins, hormones that regulate stem elongation through control of both cell expansion and cell division. Altered cell expansion in leaves could also produce novel and ornamental plant forms.

General development and morphology: phase change and floral reversion. Transcription factors that regulate phase change can modulate the developmental programs of plants and regulate developmental plasticity of the shoot meristem. In particular, these genes might be used to manipulate seasonality and influence whether plants display an annual or perennial habit.

General development and morphology: rapid growth and/or development. A number of the presently disclosed transcription factor genes have been shown to have significant effects on plant growth rate and development. These observations have included, for example, more rapid or delayed growth and development of reproductive organs. Thus, by causing more rapid development, genes that induce rapid growth or development and their functional equivalogs would prove useful for regions with short growing seasons; other transcription factors that delay development may be useful for regions with longer growing seasons. Accelerating plant growth would also improve early yield or increase biomass at an earlier stage, when such is desirable (for example, in producing forestry products or vegetable sprouts for consumption). Transcription factors that promote faster development such as G807 and its functional equivalogs may also be used to modify the reproductive cycle of plants.

General development and morphology: slow growth rate. A number of the presently disclosed transcription factor genes, including G447, G740, G1062, G1335, G1468, and G1474, have been shown to have significant effects on retarding plant growth rate and development. These observations have included, for example, delayed growth and development of reproductive organs. Slow growing plants may be highly desirable to ornamental horticulturists, both for providing house plants that display little change in their appearance over time and for providing outdoor plants for which wild-type or rapid growth is undesirable (e.g., ornamental palm trees). Slow growth may also provide for a prolonged fruiting period, thus extending the harvesting season, particularly in regions with long growing seasons. Slow growth could also provide a prolonged period in which pollen is available for improved self- or cross-fertilization, or cross-fertilization of cultivars that normally flower over non-overlapping time periods. The latter aspect may be particularly useful to plants comprising two or more distinct grafted cultivars (e.g., fruit trees) with normally non-overlapping flowering periods.

General development and morphology: senescence. Presently disclosed transcription factor genes may be used to alter senescence responses in plants. Although leaf senescence is thought to be an evolutionary adaptation to recycle nutrients, the ability to control senescence in an agricultural setting has significant value. For example, a delay in leaf senescence in some maize hybrids is associated with a significant increase in yields, and a delay of a few days in the senescence of soybean plants can have a large impact on yield. In an experimental setting, tobacco plants engineered to inhibit leaf senescence had a longer photosynthetic lifespan, and produced a 50% increase in dry weight and seed yield (Gan and Amasino (1995) Science 270: 1986-1988). Delayed flower senescence caused by overexpression of transcription factors (e.g., G249, G571, G878, G1050 or their equivalogs) may generate plants that retain their blossoms longer and this may be of potential interest to the ornamental horticulture industry, and delayed foliar and fruit senescence could improve post-harvest shelf-life of produce.

Premature senescence caused by, for example, G636, G1128 and their equivalogs may be used to improve a plant's response to disease and hasten fruit ripening.

Growth rate and development: lethality and necrosis. Overexpression of transcription factors, for example, G24, G374, G515, G578, G877, G1076, G1304 and their equivalogs that have a role in regulating cell death may be used to induce lethality in specific tissues or necrosis in response to pathogen attack. For example, if a transcription factor gene inducing lethality or necrosis were specifically active in gametes or reproductive organs, its expression in these tissues would lead to ablation and subsequent male or female sterility. Alternatively, under pathogen-regulated expression, a necrosis-inducing transcription factor can restrict the spread of a pathogen infection through a plant.

Plant size: large plants. Plants overexpressing G46, G624, G1073, G1435, G1451, and G1468, for example, have been shown to be larger than controls. For some ornamental plants, the ability to provide larger varieties with these genes or their equivalogs may be highly desirable. For many plants, including fruit-bearing trees, trees that are used for lumber production, or trees and shrubs that serve as view or wind screens, increased stature provides improved benefits in the forms of greater yield or improved screening. Crop species may also produce higher yields on larger cultivars, particularly those in which the vegetative portion of the plant is edible.

Plant size: large seedlings. Presently disclosed transcription factor genes that produce large seedlings can be used to produce crops that become established faster. Large seedlings are generally hardier, less vulnerable to stress, and better able to out-compete weed species. Seedlings transformed with presently disclosed transcription factors, including G1313, for example, have been shown to possess larger cotyledons and were more developmentally advanced than control plants. Rapid seedling development made possible by manipulating expression of these genes or their equivalogs is likely to reduce loss due to diseases particularly prevalent at the seedling stage (e.g., damping off) and is thus important for survivability of plants germinating in the field or in controlled environments.

Plant size: dwarfed plants. Presently disclosed transcription factor genes that can be used to decrease plant stature, including, for example, G24 and many others and their equivalogs, are likely to produce plants that are more resistant to damage by wind and rain, have improved lodging resistance, or are more resistant to heat, low humidity or water deficit. Dwarf plants are also of significant interest to the ornamental horticulture industry, and particularly for home garden applications for which space availability may be limited.

Plant size: fruit size and number. Introduction of presently disclosed transcription factor genes that affect fruit size will have desirable impacts on fruit size and number, which may comprise increases in yield for fruit crops, or reduced fruit yield when vegetative growth is preferred (e.g., with bushy ornamentals) or where fruit is undesirable (e.g., with ornamental olive trees).

Leaf morphology: dark leaves. Color-affecting components in leaves include chlorophylls (generally green), anthocyanins (generally red to blue) and carotenoids (generally yellow to red). Transcription factor genes that increase these pigments in leaves, including G385, G447, G912, G932, G977, G1128, G1267, G1323, G1327, G1334, G1499, and their equivalogs, may positively affect a plant's value to the ornamental horticulture industry. Variegated varieties, in particular, would show improved contrast. Other uses that result from overexpression of transcription factor genes include improvements in the nutritional value of foodstuffs. For example, lutein is an important nutraceutical; lutein-rich diets have been shown to help prevent age-related macular degeneration (ARMD), the leading cause of blindness in elderly people. Consumption of dark green leafy vegetables has been shown in clinical studies to reduce the risk of ARMD.

Enhanced chlorophyll and carotenoid levels could also improve yield in crop plants. Lutein, like other xanthophylls such as zeaxanthin and violaxanthin, is an essential component in the protection of the plant against the damaging effects of excessive light. Specifically, lutein contributes, directly or indirectly, to the rapid rise of non-photochemical quenching in plants exposed to high light. Crop plants engineered to contain higher levels of lutein could therefore have improved photo-protection, leading to less oxidative damage and better growth under high light (e.g., during long summer days, or at higher altitudes or lower latitudes than those at which a non-transformed plant would survive). Additionally, elevated chlorophyll levels increase photosynthetic capacity.

Leaf morphology: changes in leaf shape. Presently disclosed transcription factors produce marked and diverse effects on leaf development and shape. The transcription factors include G32, G224, G428, G464, G629, G671, G736, G903, G905, G921, G932, G977, G1038, G1067, G1073, G1075, G1269, G1493, G1645, G1468, and their equivalogs. At early stages of growth, transgenic seedlings have developed narrow, upward pointing leaves with long petioles, possibly indicating a disruption in circadian-clock controlled processes or nyctinastic movements. Other transcription factor genes can be used to alter leaf shape in a significant manner from wild type, some of which may find use in ornamental applications.

Leaf morphology: altered leaf size. Large leaves, such as those produced in plants overexpressing G438, G1274, G1451 and their functional equivalogs, generally increase plant biomass. This provides benefit for crops where the vegetative portion of the plant is the marketable portion.

Leaf morphology: light green, gray and variegated leaves. Transcription factor genes that provide an altered appearance, including G1468 and its equivalogs, may positively affect a plant's value to the ornamental horticulture industry.

Leaf morphology: glossy leaves. Transcription factor genes such as G1267 and its equivalogs that induce the formation of glossy leaves generally do so by elevating levels of epidermal wax. Thus, the genes could be used to engineer changes in the composition and amount of leaf surface components, including waxes. The ability to manipulate wax composition, amount, or distribution could modify plant tolerance to drought and low humidity, or resistance to insects or pathogens. Additionally, wax may be a valuable commodity in some species, and altering its accumulation and/or composition could enhance yield.

Seed morphology: altered seed coloration. Presently disclosed transcription factor genes, including G156 and G668, have been used to modify seed color, which, along with the equivalogs of these genes, could provide added appeal to seeds or seed products.

Seed morphology: altered seed size and shape. The introduction into plants of presently disclosed transcription factor genes that increase (e.g., G206, G584, G1255) or decrease (e.g., G6145) the size of seeds may have a significant impact on yield and appearance, particularly when the product is the seed itself (e.g., in the case of grains, legumes, nuts, etc.). Seed size, in addition to seed coat integrity, thickness and permeability, seed water content and a number of other components including antioxidants and oligosaccharides, also affects seed longevity in storage, with larger seeds often being more desirable for prolonged storage.

Transcription factor genes that alter seed shape, including G1062, G1145 and their equivalogs, may both have ornamental applications and improve or broaden the appeal of seed products.

Leaf biochemistry: increased leaf wax. Overexpression of transcription factor genes, including G975 and its equivalogs, which results in increased leaf wax could be used to manipulate wax composition, amount, or distribution. These transcription factors can improve yield in those plants and crops from which wax is a valuable product. The genes may also be used to modify plant tolerance to drought and/or low humidity or resistance to insects, as well as plant appearance (glossy leaves). Increased wax deposition on the leaves of a plant may improve water use efficiency. Manipulation of these genes may reduce the wax coating on sunflower seeds; this wax fouls the oil extraction system during sunflower seed processing for oil. For the latter purpose or any other where wax reduction is valuable, antisense or cosuppression of the transcription factor genes in a tissue-specific manner would be valuable.

Leaf biochemistry: leaf prenyl lipids, including tocopherol. Prenyl lipids play a role in anchoring proteins in membranes or membranous organelles. Thus modifying the prenyl lipid content of seeds and leaves could affect membrane integrity and function. One important group of prenyl lipids, the tocopherols, has both anti-oxidant and vitamin E activity. A number of presently disclosed transcription factor genes, including G214, G280, G987, G1133, G1324, and G1328, have been shown to modify the prenyl lipid content of leaves in plants, and these genes and their equivalogs may thus be used to alter prenyl lipid content of leaves.

Leaf biochemistry: altered leaf insoluble sugars. Overexpression of a number of presently disclosed transcription factors, including G211, G237, G242, G274, G307, G428, G435, G525, G598, G777, G869, G1012, and G1309, resulted in plants with altered leaf insoluble sugar content. These transcription factors and their equivalogs that alter plant cell wall composition have several potential applications, including altering food digestibility, plant tensile strength, wood quality, pathogen resistance and pulp production. In particular, hemicellulose is not desirable in paper pulps because of its lack of strength compared with cellulose. Thus modulating the amounts of cellulose vs. hemicellulose in the plant cell wall is desirable for the paper/lumber industry. Increasing the insoluble carbohydrate content in various fruits, vegetables, and other edible consumer products will result in enhanced fiber content. Increased fiber content would not only provide health benefits in food products, but might also increase digestibility of forage crops. In addition, the hemicellulose and pectin content of fruits and berries affects the quality of jam and catsup made from them. Changes in hemicellulose and pectin content could result in a superior consumer product.

Leaf biochemistry: increased leaf anthocyanin. Several presently disclosed transcription factor genes, including G663 and its equivalogs, may be used to alter anthocyanin production in numerous plant species. Expression of presently disclosed transcription factor genes that increase flavonoid production in plants, including anthocyanins and condensed tannins, may be used to alter pigment production for horticultural purposes, and possibly to increase stress resistance. A number of flavonoids have been shown to have antimicrobial activity and could be used to engineer pathogen resistance. Several flavonoid compounds have health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of the oxidation of lipids. Increased levels of condensed tannins in forage legumes would be an important agronomic trait because they prevent pasture bloat by collapsing protein foams within the rumen. For a review on the utilities of flavonoids and their derivatives, refer to Dixon et al. (1999) Trends Plant Sci. 4: 394-400.

Leaf and seed biochemistry: altered fatty acid content. A number of the presently disclosed transcription factor genes have been shown to alter the fatty acid composition in plants, and seeds and leaves in particular. This modification suggests several utilities, including improving the nutritional value of seeds or whole plants. Dietary fatty acid ratios have been shown to have an effect on, for example, bone integrity and remodeling (see, for example, Weiler (2000) Pediatr. Res. 47:5 692-697). The ratio of dietary fatty acids may alter the precursor pools of long-chain polyunsaturated fatty acids that serve as precursors for prostaglandin synthesis. In mammalian connective tissue, prostaglandins serve as important signals regulating the balance between resorption and formation in bone and cartilage. Thus dietary fatty acid ratios altered in seeds may affect the etiology and outcome of bone loss.

Transcription factors that reduce leaf fatty acids, for example, 16:3 fatty acids, may be used to control thylakoid membrane development, including proplastid to chloroplast development. The genes that encode these transcription factors (e.g., G718, G1266, and G1347) might thus be useful for controlling the transition from proplastid to chromoplast in fruits and vegetables. It may also be desirable to change the expression of these genes to prevent cotyledon greening in Brassica napus or B. campestris to avoid green oil due to early frost.

Transcription factor genes that increase leaf fatty acid production, including G214 and G231, could potentially be used to manipulate seed composition, which is very important for the nutritional value and production of various food products. A number of transcription factor genes are involved in mediating an aspect of the regulatory response to temperature. These genes may be used to alter the expression of desaturases that lead to production of 18:3 and 16:3 fatty acids, the balance of which affects membrane fluidity and mitigates damage to cell membranes and photosynthetic structures at high and low temperatures.

Leaf and seed biochemistry: glucosinolates. A number of glucosinolates have been shown to have anti-cancer activity; thus, increasing the levels or composition of these compounds by introducing several of the presently disclosed transcription factors, including G185, G681, G1069, G1198, and G1322, can have a beneficial effect on human diet.

Glucosinolates are undesirable components of the oilseeds used in animal feed since they produce toxic effects. Low-glucosinolate varieties of canola, for example, have been developed to combat this problem. Glucosinolates form part of a plant's natural defense against insects. Modification of glucosinolate composition or quantity by introducing transcription factors that affect these characteristics can therefore afford increased protection from herbivores. Furthermore, in edible crops, tissue specific promoters can be used to ensure that these compounds accumulate specifically in tissues, such as the epidermis, which are not taken for consumption.

Leaf and seed biochemistry: production of seed and leaf phytosterols. Presently disclosed transcription factor genes that modify levels of phytosterols in plants may have at least two utilities. First, phytosterols are an important source of precursors for the manufacture of human steroid hormones. Thus, regulation of transcription factor expression or activity could lead to elevated levels of important human steroid precursors for steroid semi-synthesis. For example, transcription factors that cause elevated levels of campesterol in leaves, or sitosterols and stigmasterols in seed crops, would be useful for this purpose. Second, phytosterols and their hydrogenated derivatives, phytostanols, have proven cholesterol-lowering properties, and transcription factor genes that modify the levels of these compounds in plants would thus provide health benefits.

Seed biochemistry: modified seed oil and fatty acid content. The composition of seeds, particularly with respect to seed oil amounts and/or composition, is very important for the nutritional and caloric value and production of various food and feed products. Several of the presently disclosed transcription factor genes that alter seed oil content or seed lipid saturation could be used to improve the heat stability of oils or to improve the nutritional quality of seed oil, by, for example, reducing the number of calories in seed by decreasing oil or fatty acid content (e.g., G180, G192, G201, G222, G241, G663, G668, G718, G732, G777, G911, G1323, and G1820), increasing the number of calories in animal feeds by increasing oil or fatty acid content (e.g., G162, G229, G231, G291, G456, G464, G561, G590, G598, G732, G849, G961, G6190, and G1198), or altering seed oil content (e.g., G509, G567, G732, G974, G1451, and G1471).

Seed biochemistry: modified seed protein content. As with seed oils, the composition of seeds, particularly with respect to protein amounts and/or composition, is very important for the nutritional value and production of various food and feed products. A number of the presently disclosed transcription factor genes modify the protein concentrations in seeds, including G201, G222, G226, G241, G629, G630, G663, G668, G718, G732, G865, G911, G1048, G1323, G1449, and G1820, which increase seed protein; G229, G231, G418, G456, G464, G732, and G1634, which decrease seed protein; and G162, G509, G567, G732, and G849, which alter seed protein content. These genes would provide nutritional benefits, and may be used to prolong storage, increase seed pest or disease resistance, or modify germination rates.

Seed biochemistry: seed prenyl lipids. Prenyl lipids play a role in anchoring proteins in membranes or membranous organelles. Thus, presently disclosed transcription factor genes and their equivalogs that modify the prenyl lipid content of seeds and leaves could affect membrane integrity and function. A number of presently disclosed transcription factor genes, including G214, G718, G748, G883, and G1052, have been shown to modify the tocopherol composition of plants. α-Tocopherol is better known as vitamin E. Tocopherols such as α- and γ-tocopherol both have anti-oxidant activity.

Seed biochemistry: increased seed anthocyanin. Several presently disclosed transcription factor genes, including G663 and its equivalogs, may be used to alter anthocyanin production in the seeds of plants. As with leaf anthocyanins, expression of presently disclosed transcription factor genes that increase flavonoid (anthocyanins and condensed tannins) production in seeds, including G663 and its equivalogs, may be used to alter pigment production for horticultural purposes, and possibly to increase stress resistance, antimicrobial activity and health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of the oxidation of lipids.

Root biochemistry: increased root anthocyanin. Presently disclosed transcription factor genes, including G663, may be used to alter anthocyanin production in the roots of plants. As described above for seed anthocyanins, expression of presently disclosed transcription factor genes that increase flavonoid (anthocyanins and condensed tannins) production in roots, including G663 and its equivalogs, may be used to alter pigment production for horticultural purposes, and possibly to increase stress resistance, antimicrobial activity and health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of the oxidation of lipids.

Light response/shade avoidance: altered cotyledon, hypocotyl, petiole development, altered leaf orientation, constitutive photomorphogenesis, photomorphogenesis in low light. Presently disclosed transcription factor genes, including G351, G1062, and G1322, that modify a plant's response to light may be useful for modifying plant growth or development, for example, photomorphogenesis in poor light, or accelerating flowering time in response to various light intensities, quality or duration to which a non-transformed plant would not similarly respond. Examples of such responses that have been demonstrated include leaf number and arrangement, and early flower bud appearances. Elimination of shading responses may lead to increased planting densities with subsequent yield enhancement. As these genes may also alter plant architecture, they may find use in the ornamental horticulture industry.

Pigment: increased anthocyanin level in various plant organs and tissues. In addition to seed, leaves and roots, as mentioned above, several presently disclosed transcription factor genes (i.e., G663 and equivalogs) can be used to alter anthocyanin levels in one or more tissues, depending on the organ in which these genes are expressed. The potential utilities of these genes include alterations in pigment production for horticultural purposes, and possibly increasing stress resistance, antimicrobial activity and health promoting effects such as inhibition of tumor growth, prevention of bone loss and prevention of the oxidation of lipids.

Miscellaneous biochemistry: diterpenes in leaves and other plant parts. Depending on the plant species, varying amounts of diverse secondary biochemicals (often lipophilic terpenes) are produced and exuded or volatilized by trichomes. These exotic secondary biochemicals, which are relatively easy to extract because they are on the surface of the leaf, have been widely used in such products as flavors and aromas, drugs, pesticides and cosmetics. Thus, the overexpression of genes that are used to produce diterpenes in plants may be accomplished by introducing transcription factor genes that induce said overexpression. One class of secondary metabolites, the diterpenes, can affect several biological systems such as tumor progression, prostaglandin synthesis and tissue inflammation. In addition, diterpenes can act as insect pheromones and termite allomones, and can exhibit neurotoxic, cytotoxic and antimitotic activities. As a result of this functional diversity, diterpenes have been the target of research in several pharmaceutical ventures. In most cases where the metabolic pathways are impossible to engineer, increasing trichome density or size on leaves may be the only way to increase plant productivity.

Miscellaneous Biochemistry: Production of Miscellaneous Secondary Metabolites.

Microarray data suggest that flux through the aromatic amino acid biosynthetic pathways and primary and secondary metabolite biosynthetic pathways is up-regulated. Genes coding for enzymes involved in alkaloid biosynthesis, including indole-3-glycerol phosphatase and strictosidine synthase, are induced in G229 overexpressors. Genes for enzymes involved in aromatic amino acid biosynthesis, including tryptophan synthase and tyrosine transaminase, are also up-regulated. Phenylalanine ammonia lyase, chalcone synthase and trans-cinnamate mono-oxygenase, which are involved in phenylpropanoid biosynthesis, are also induced.

Antisense and Co-Suppression

In addition to expression of the nucleic acids of the invention as gene replacement or plant phenotype modification nucleic acids, the nucleic acids are also useful for sense and anti-sense suppression of expression, e.g., to down-regulate expression of a nucleic acid of the invention, e.g., as a further mechanism for modulating plant phenotype. That is, the nucleic acids of the invention, or subsequences or anti-sense sequences thereof, can be used to block expression of naturally occurring homologous nucleic acids. A variety of sense and anti-sense technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997) Antisense Technology: A Practical Approach IRL Press at Oxford University Press, Oxford, U.K. Antisense regulation is also described in Crowley et al. (1985) Cell 43: 633-641; Rosenberg et al. (1985) Nature 313: 703-706; Preiss et al. (1985) Nature 313: 27-32; Melton (1985) Proc. Natl. Acad. Sci. 82: 144-148; Izant and Weintraub (1985) Science 229: 345-352; and Kim and Wold (1985) Cell 42: 129-138. Additional methods for antisense regulation are known in the art. Antisense regulation has been used to reduce or inhibit expression of plant genes as described, for example, in European Patent Publication No. 271988. Antisense RNA may be used to reduce gene expression to produce a visible or biochemical phenotypic change in a plant (Smith et al. (1988) Nature 334: 724-726; Smith et al. (1990) Plant Mol. Biol. 14: 369-379). In general, sense or anti-sense sequences are introduced into a cell, where they are optionally amplified, e.g., by transcription. Such sequences include both simple oligonucleotide sequences and catalytic sequences such as ribozymes.

For example, a reduction or elimination of expression (i.e., a “knock-out”) of a transcription factor or transcription factor homolog polypeptide in a transgenic plant, e.g., to modify a plant trait, can be obtained by introducing an antisense construct corresponding to the polypeptide of interest as a cDNA. For antisense suppression, the transcription factor or homolog cDNA is arranged in reverse orientation (with respect to the coding sequence) relative to the promoter sequence in the expression vector. The introduced sequence need not be the full length cDNA or gene, and need not be identical to the cDNA or gene found in the plant type to be transformed. Typically, the antisense sequence need only be capable of hybridizing to the target gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a higher degree of homology to the endogenous transcription factor sequence will be needed for effective antisense suppression. While antisense sequences of various lengths can be utilized, preferably, the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and improved antisense suppression will typically be observed as the length of the antisense sequence increases. More preferably, the length of the antisense sequence in the vector will be greater than 100 nucleotides. Transcription of an antisense construct as described results in the production of RNA molecules that are the reverse complement of mRNA molecules transcribed from the endogenous transcription factor gene in the plant cell.
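The reverse-complement relationship between an antisense transcript and its target mRNA can be illustrated with a minimal sketch. The function names below are hypothetical and are not part of the disclosure; the length check simply mirrors the stated preference for antisense sequences of at least 30 nucleotides.

```python
# Illustrative sketch only; helper names are assumptions, not from the specification.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement (written 5'->3') of an mRNA sequence."""
    rna = mrna.upper().replace("T", "U")  # tolerate cDNA-style input
    return rna.translate(COMPLEMENT)[::-1]


def meets_length_preference(antisense: str, minimum: int = 30) -> bool:
    """Check the preferred minimum antisense length (30 nt by default)."""
    return len(antisense) >= minimum


if __name__ == "__main__":
    target = "AUGGCC"
    print(antisense_rna(target))
    print(meets_length_preference(antisense_rna(target)))
```

Applying `antisense_rna` twice returns the original sequence, which is a quick sanity check that the transcript is a true reverse complement of the target.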

Suppression of endogenous transcription factor gene expression can also be achieved using a ribozyme. Ribozymes are RNA molecules that possess highly specific endoribonuclease activity. The production and use of ribozymes are disclosed in U.S. Pat. No. 4,987,071 and U.S. Pat. No. 5,543,508. Synthetic ribozyme sequences including antisense RNAs can be used to confer RNA cleaving activity on the antisense RNA, such that endogenous mRNA molecules that hybridize to the antisense RNA are cleaved, which in turn leads to an enhanced antisense inhibition of endogenous gene expression.

Vectors in which RNA encoded by a transcription factor or transcription factor homolog cDNA is over-expressed can also be used to obtain co-suppression of a corresponding endogenous gene, e.g., in the manner described in U.S. Pat. No. 5,231,020 to Jorgensen. Such co-suppression (also termed sense suppression) does not require that the entire transcription factor cDNA be introduced into the plant cells, nor does it require that the introduced sequence be exactly identical to the endogenous transcription factor gene of interest. However, as with antisense suppression, the suppressive efficiency will be enhanced as specificity of hybridization is increased, e.g., as the introduced sequence is lengthened, and/or as the sequence similarity between the introduced sequence and the endogenous transcription factor gene is increased.

Vectors expressing an untranslatable form of the transcription factor mRNA (e.g., sequences comprising one or more stop codons or nonsense mutations) can also be used to suppress expression of an endogenous transcription factor, thereby reducing or eliminating its activity and modifying one or more traits. Methods for producing such constructs are described in U.S. Pat. No. 5,583,021. Preferably, such constructs are made by introducing a premature stop codon into the transcription factor gene. Alternatively, a plant trait can be modified by gene silencing using double-strand RNA (Sharp (1999) Genes and Development 13: 139-141). Another method for abolishing the expression of a gene is by insertion mutagenesis using the T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the mutants can be screened to identify those containing the insertion in a transcription factor or transcription factor homolog gene. Plants containing a single transgene insertion event at the desired gene can be crossed to generate homozygous plants for the mutation. Such methods are well known to those of skill in the art (see, for example, Koncz et al. (1992) Methods in Arabidopsis Research, World Scientific Publishing Co. Pte. Ltd., River Edge, N.J.).

Alternatively, a plant phenotype can be altered by eliminating an endogenous gene, such as a transcription factor or transcription factor homolog, e.g., by homologous recombination (Kempin et al. (1997) Nature 389: 802-803).

A plant trait can also be modified by using the Cre-lox system (for example, as described in U.S. Pat. No. 5,658,772). A plant genome can be modified to include first and second lox sites that are then contacted with a Cre recombinase. If the lox sites are in the same orientation, the intervening DNA sequence between the two sites is excised. If the lox sites are in the opposite orientation, the intervening sequence is inverted.
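The two Cre-lox outcomes can be sketched with a toy model in which a genome is a list of named elements and the lox sites are the hypothetical markers "lox>" and "lox<" (the arrow denotes orientation). This representation and the function name are assumptions made for illustration. The sketch only reorders named elements and does not model the reverse complementation of the inverted DNA itself.

```python
# Toy model; the list representation and marker strings are illustrative assumptions.
LOX_MARKERS = ("lox>", "lox<")


def cre_recombine(genome):
    """Apply Cre recombination to a genome containing exactly two lox sites."""
    sites = [i for i, element in enumerate(genome) if element in LOX_MARKERS]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    first, second = sites
    if genome[first] == genome[second]:
        # Same orientation: the intervening elements are excised and a
        # single lox site remains in the genome.
        return genome[: first + 1] + genome[second + 1:]
    # Opposite orientation: the intervening elements are inverted in place.
    inverted = genome[first + 1: second][::-1]
    return genome[: first + 1] + inverted + genome[second:]


if __name__ == "__main__":
    print(cre_recombine(["P", "lox>", "A", "B", "lox>", "T"]))
    print(cre_recombine(["P", "lox>", "A", "B", "lox<", "T"]))
```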

The polynucleotides and polypeptides of this invention can also be expressed in a plant in the absence of an expression cassette by manipulating the activity or expression level of the endogenous gene by other means, such as, for example, by ectopically expressing a gene by T-DNA activation tagging (Ichikawa et al. (1997) Nature 390: 698-701; Kakimoto et al. (1996) Science 274: 982-985). This method entails transforming a plant with a gene tag containing multiple transcriptional enhancers; once the tag has inserted into the genome, expression of a flanking gene coding sequence becomes deregulated. In another example, the transcriptional machinery in a plant can be modified so as to increase transcription levels of a polynucleotide of the invention (see, e.g., PCT Publications WO 96/06166 and WO 98/53057, which describe the modification of the DNA-binding specificity of zinc finger proteins by changing particular amino acids in the DNA-binding motif).

The transgenic plant can also include the machinery necessary for expressing or altering the activity of a polypeptide encoded by an endogenous gene, for example, by altering the phosphorylation state of the polypeptide to maintain it in an activated state.

Transgenic plants (or plant cells, or plant explants, or plant tissues) incorporating the polynucleotides of the invention and/or expressing the polypeptides of the invention can be produced by a variety of well established techniques as described above. Following construction of a vector, most typically an expression cassette, including a polynucleotide, e.g., encoding a transcription factor or transcription factor homolog, of the invention, standard techniques can be used to introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce a transgenic plant.

The plant can be any higher plant, including gymnosperms, monocotyledonous and dicotyledonous plants. Suitable protocols are available for Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot, celery, parsnip), Cruciferae (cabbage, radish, rapeseed, broccoli, etc.), Cucurbitaceae (melons and cucumber), Gramineae (wheat, corn, rice, barley, millet, etc.), Solanaceae (potato, tomato, tobacco, peppers, etc.), and various other crops. See protocols described in Ammirato et al., Eds., (1984) Handbook of Plant Cell Culture—Crop Species, Macmillan Publ. Co., New York, N.Y.; Shimamoto et al. (1989) Nature 338: 274-276; Fromm et al. (1990) Bio/Technol. 8: 833-839; and Vasil et al. (1990) Bio/Technol. 8: 429-434.

Transformation and regeneration of both monocotyledonous and dicotyledonous plant cells are now routine, and the selection of the most appropriate transformation technique will be determined by the practitioner. The choice of method will vary with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods for given plant types. Suitable methods can include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens mediated transformation. Transformation means introducing a nucleotide sequence into a plant in a manner to cause stable or transient expression of the sequence.

Successful examples of the modification of plant characteristics by transformation with cloned sequences which serve to illustrate the current knowledge in this field of technology, and which are herein incorporated by reference, include: U.S. Pat. Nos. 5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615; 5,750,871; 5,268,526; 5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042.

Following transformation, plants are preferably selected using a dominant selectable marker incorporated into the transformation vector. Typically, such a marker will confer antibiotic or herbicide resistance on the transformed plants, and selection of transformants can be accomplished by exposing the plants to appropriate concentrations of the antibiotic or herbicide.

After transformed plants are selected and grown to maturity, those plants showing a modified trait are identified. The modified trait can be any of those traits described above. Additionally, to confirm that the modified trait is due to changes in expression levels or activity of the polypeptide or polynucleotide of the invention, such changes can be determined by analyzing mRNA expression using Northern blots, RT-PCR or microarrays, or protein expression using immunoblots or Western blots or gel shift assays.

Integrated Systems—Sequence Identity

Additionally, the present invention may be an integrated system, computer or computer readable medium that comprises an instruction set for determining the identity of one or more sequences in a database. In addition, the instruction set can be used to generate or identify sequences that meet any specified criteria. Furthermore, the instruction set may be used to associate or link certain functional benefits, such as improved characteristics, with one or more identified sequences.

For example, the instruction set can include a sequence comparison or other alignment program, e.g., an available program from the Wisconsin Package Version 10.0, such as BLAST, FASTA, PILEUP, FINDPATTERNS or the like (GCG, Madison, Wis.). Public sequence databases such as GenBank, EMBL, Swiss-Prot and PIR or private sequence databases such as the PHYTOSEQ sequence database (Incyte Genomics, Palo Alto, Calif.) can be searched.

Alignment of sequences for comparison can be conducted by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2: 482-489, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48: 443-453, by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. 85: 2444-2448, or by computerized implementations of these algorithms. After alignment, sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by comparing the sequences over a comparison window to identify and compare local regions of sequence similarity. The comparison window can be a segment of at least about 20 contiguous positions, usually about 50 to about 200, more usually about 100 to about 150 contiguous positions. A description of the method is provided in Ausubel et al. supra.
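As a rough illustration of comparing two already-aligned sequences over a comparison window, the following sketch scans a fixed-width window along the alignment and reports the window with the highest percent identity. The function names are hypothetical, and the identity definition used here (identical non-gap positions divided by window length) is one assumed convention among several; it is not necessarily the definition used by any particular alignment package.

```python
# Illustrative sketch; function names and the identity convention are assumptions.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding identical, non-gap residues."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def best_window(aligned_a: str, aligned_b: str, window: int = 20):
    """Return (start, identity) of the first window with the highest identity."""
    if window > len(aligned_a):
        raise ValueError("window is longer than the alignment")
    best_start, best_score = 0, -1.0
    for start in range(len(aligned_a) - window + 1):
        score = percent_identity(aligned_a[start:start + window],
                                 aligned_b[start:start + window])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


if __name__ == "__main__":
    print(best_window("AAAACCCC", "TTTTCCCC", window=4))
```

The default window of 20 positions follows the minimum window size mentioned in the text; larger windows of 50 to 200 positions can be passed in the same way.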

A variety of methods for determining sequence relationships can be used, including manual alignment and computer assisted sequence alignment and analysis. This latter approach is a preferred approach in the present invention, due to the increased throughput afforded by computer assisted methods. As noted above, a variety of computer programs for performing sequence alignment are available, or can be produced by one of skill.

One example algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410. Software for performing BLAST analyses is publicly available, e.g., through the National Library of Medicine's National Center for Biotechnology Information (ncbi.nlm.nih; see at world wide web (www) National Institutes of Health US government (gov) website). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al. supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. 89: 10915-10919). Unless otherwise indicated, “sequence identity” here refers to the % sequence identity generated from a tblastx using the NCBI version of the algorithm at the default settings using gapped alignments with the filter “off” (see, for example, NIH NLM NCBI website at ncbi.nlm.nih, supra).
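The word-hit extension step described above can be sketched for nucleotide sequences as follows. This is a simplified, ungapped, rightward-only illustration and is not the NCBI implementation: the function name and the x_drop default are assumptions, the seed is required to be an exact word match, and M=5 and N=-4 are taken from the BLASTN defaults quoted in the text. Extension to the left would follow the same logic in the opposite direction.

```python
# Simplified illustration of ungapped seed extension; not the NCBI BLAST code.
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right; return (best_score, best_length)."""
    seed_q = query[q_start:q_start + word_len]
    seed_s = subject[s_start:s_start + word_len]
    if len(seed_q) != word_len or seed_q != seed_s:
        raise ValueError("seed must be an exact word match of length word_len")
    score = word_len * match          # cumulative score of the seed itself
    best_score, best_len = score, word_len
    offset = word_len
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch
        offset += 1
        if score > best_score:
            best_score, best_len = score, offset
        elif best_score - score >= x_drop or score <= 0:
            # Halt: the score fell by X from its maximum, or reached zero or below.
            break
    # The loop also ends when either sequence runs out.
    return best_score, best_len


if __name__ == "__main__":
    print(extend_right("ACGTTACG", "ACGTAACG", 0, 0, 4))
```

A smaller x_drop makes the extension give up sooner after a run of mismatches, which trades sensitivity for speed in the way the text attributes to the X parameter.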

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. 90: 5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence (and, therefore, in this context, homologous) if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, or less than about 0.01, or even less than about 0.001. An additional example of a useful sequence alignment algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments. The program can align, e.g., up to 300 sequences of a maximum length of 5,000 letters.

The integrated system or computer typically includes a user input interface allowing a user to selectively view one or more sequence records corresponding to the one or more character strings, as well as an instruction set which aligns the one or more character strings with each other or with an additional character string to identify one or more regions of sequence similarity. The system may include a link of one or more character strings with a particular phenotype or gene function. Typically, the system includes a user readable output element that displays an alignment produced by the alignment instruction set.

The methods of this invention can be implemented in a localized or distributed computing environment. In a distributed environment, the methods may be implemented on a single computer comprising multiple processors or on a multiplicity of computers. The computers can be linked, e.g., through a common bus, but more preferably the computer(s) are nodes on a network. The network can be a generalized or a dedicated local or wide-area network and, in certain preferred embodiments, the computers may be components of an intra-net or an internet.

Thus, the invention provides methods for identifying a sequence similar or homologous to one or more polynucleotides as noted herein, or one or more target polypeptides encoded by the polynucleotides, or otherwise noted herein and may include linking or associating a given plant phenotype or gene function with a sequence. In the methods, a sequence database is provided (locally or across an inter or intra net) and a query is made against the sequence database using the relevant sequences herein and associated plant phenotypes or gene functions.

Any sequence herein can be entered into the database, before or after querying the database. This provides for both expansion of the database and, if done before the querying step, for insertion of control sequences into the database. The control sequences can be detected by the query to ensure the general integrity of both the database and the query. As noted, the query can be performed using a web browser based interface. For example, the database can be a centralized public database such as those noted herein, and the querying can be done from a remote terminal or computer across an internet or intranet.

Any sequence herein can be used to identify a similar, homologous, paralogous, or orthologous sequence in another plant. This provides means for identifying endogenous sequences in other plants that may be useful to alter a trait of progeny plants, which results from crossing two plants of different strains. For example, sequences that encode an ortholog of any of the sequences herein that naturally occur in a plant with a desired trait can be identified using the sequences disclosed herein. The plant is then crossed with a second plant of the same species but which does not have the desired trait to produce progeny which can then be used in further crossing experiments to produce the desired trait in the second plant. Therefore the resulting progeny plant contains no transgenes. Expression of the endogenous sequence may also be regulated by treatment with a particular chemical or other means, such as EMR. Some examples of such compounds well known in the art include: ethylene; cytokinins; phenolic compounds, which stimulate the transcription of the genes needed for infection; specific monosaccharides and acidic environments which potentiate vir gene induction; acidic polysaccharides which induce one or more chromosomal genes; and opines. Other mechanisms include light or dark treatment (for a review of examples of such treatments, see Winans (1992) Microbiol. Rev. 56: 12-31; Eyal et al. (1992) Plant Mol. Biol. 19: 589-599; Chrispeels et al. (2000) Plant Mol. Biol. 42: 279-290; Piazza et al. (2002) Plant Physiol. 128: 1077-1086).

Table 7 lists sequences discovered to be orthologous to a number ofrepresentative transcription factors of the present invention. Thecolumn headings include the transcription factors listed by SEQ ID NO;corresponding Gene ID (GID) numbers; the species from which theorthologs to the transcription factors are derived; the type of sequence(i.e., DNA or protein) discovered to be orthologous to the transcriptionfactors; and the SEQ ID NO of the orthologs, the latter corresponding tothe ortholog SEQ ID NOs lisetd in the Sequence Listing. TABLE 7Orthologs of Representative Arabidopsis Transcription Factor Genes SEQID NO: of GID NO of SEQ ID NO: of Ortholog or Sequence type OrthologousNucleotide Encoding Nucleotide Species from used for ArabidopsisOrthologous Encoding Which Ortholog determination TranscriptionArabidopsis Ortholog is Derived (DNA or Protein) Factor TranscriptionFactor 961 Glycine max DNA G8 11 962 Glycine max DNA G8 11 963 Glycinemax DNA G8 11 964 Glycine max DNA G8 11 965 Oryza sativa DNA G8 11 966Oryza sativa DNA G8 11 967 Zea mays DNA G8 11 968 Zea mays DNA G8 11 969Zea mays DNA G8 11 970 Glycine max DNA G19 21 971 Glycine max DNA G19 21972 Glycine max DNA G19 21 973 Glycine max DNA G19 21 974 Oryza sativaDNA G19 21 975 Oryza sativa DNA G19 21 976 Oryza sativa DNA G19 21 977Zea mays DNA G19 21 978 Zea mays DNA G19 21 979 Glycine max DNA G22 27980 Glycine max DNA G22 27 981 Glycine max DNA G24 29 982 Glycine maxDNA G24 29 983 Glycine max DNA G24 29 984 Glycine max DNA G24 29 985Glycine max DNA G24 29 986 Glycine max DNA G24 29 987 Glycine max DNAG24 29 988 Oryza sativa DNA G24 29 989 Oryza sativa PRT G24 29 990 Oryzasativa PRT G24 29 991 Oryza sativa PRT G24 29 992 Zea mays DNA G24 29993 Glycine max DNA G28 37 994 Glycine max DNA G28 37 995 Glycine maxDNA G28 37 996 Glycine max DNA G28 37 997 Glycine max DNA G28 37 998Glycine max DNA G28 37 999 Glycine max DNA G28 37 1000 Glycine max DNAG28 37 1001 Oryza sativa PRT G28 37 1002 Oryza sativa PRT G28 37 
1003Zea mays DNA G28 37 1004 Glycine max DNA G46 53 1005 Glycine max DNA G4653 1006 Glycine max DNA G46 53 1007 Glycine max DNA G46 53 1008 Glycinemax DNA G46 53 1009 Glycine max DNA G46 53 1010 Glycine max DNA G46 531011 Glycine max DNA G46 53 1012 Oryza sativa PRT G46 53 1013 Zea maysDNA G46 53 1014 Glycine max DNA G157 69 Glycine max DNA G1842 943Glycine max DNA G1843 945 Glycine max DNA G859 567 1015 Glycine max DNAG180 87 1016 Glycine max DNA G180 87 1017 Oryza sativa DNA G180 87 1018Oryza sativa PRT G180 87 1019 Zea mays DNA G180 87 1020 Glycine max DNAG188 97 1021 Oryza sativa PRT G188 97 1022 Oryza sativa PRT G188 97 1023Zea mays DNA G188 97 1024 Glycine max DNA G192 101 1025 Oryza sativa PRTG192 101 1026 Glycine max DNA G196 105 1027 Oryza sativa PRT G196 1051028 Oryza sativa PRT G196 105 1029 Oryza sativa PRT G196 105 1030 Zeamays DNA G196 105 1031 Zea mays DNA G196 105 1032 Glycine max DNA G211121 1033 Oryza sativa DNA G211 121 1034 Oryza sativa PRT G211 121 1035Glycine max DNA G214 127 Glycine max DNA G680 463 1036 Glycine max DNAG214 127 Glycine max DNA G680 463 1037 Glycine max DNA G214 127 Glycinemax DNA G680 463 1038 Glycine max DNA G214 127 Glycine max DNA G680 4631039 Oryza sativa DNA G214 127 Oryza sativa DNA G680 463 1040 Oryzasativa DNA G214 127 Oryza sativa DNA G680 463 1041 Zea mays DNA G214 127Zea mays DNA G680 463 1042 Zea mays DNA G214 127 Zea mays DNA G680 4631043 Zea mays DNA G214 127 Zea mays DNA G680 463 1044 Glycine max DNAG226 141 Glycine max DNA G682 467 Glycine max DNA G1816 939 Glycine maxDNA G2718 959 1045 Glycine max DNA G226 141 Glycine max DNA G1816 939Glycine max DNA G2718 959 1046 Glycine max DNA G226 141 Glycine max DNAG682 467 Glycine max DNA G1816 939 Glycine max DNA G2718 959 1047Glycine max DNA G226 141 Glycine max DNA G682 467 Glycine max DNA G1816939 Glycine max DNA G2718 959 1048 Glycine max DNA G226 141 Glycine maxDNA G682 467 Glycine max DNA G1816 939 Glycine max DNA G2718 959 1049Oryza sativa DNA G226 141 Oryza 
sativa DNA G682 467 Oryza sativa DNAG1816 939 Oryza sativa DNA G2718 959 1050 Oryza sativa PRT G226 141Oryza sativa PRT G682 467 Oryza sativa PRT G1816 939 Oryza sativa PRTG2718 959 1051 Oryza sativa PRT G226 141 Oryza sativa PRT G682 467 Oryzasativa PRT G1816 939 Oryza sativa PRT G2718 959 1052 Zea mays DNA G226141 Zea mays DNA G682 467 Zea mays DNA G1816 939 Zea mays DNA G2718 9591053 Zea mays DNA G226 141 Zea mays DNA G682 467 Zea mays DNA G1816 939Zea mays DNA G2718 959 1054 Glycine max DNA G241 163 1055 Glycine maxDNA G241 163 1056 Glycine max DNA G241 163 1057 Oryza sativa DNA G241163 1058 Zea mays DNA G241 163 1059 Zea mays DNA G241 163 1060 Zea maysDNA G241 163 1061 Zea mays DNA G241 163 1062 Zea mays DNA G241 163 1063Glycine max DNA G254 179 1064 Glycine max DNA G256 183 1065 Glycine maxDNA G256 183 1066 Glycine max DNA G256 183 1067 Glycine max DNA G256 1831068 Glycine max DNA G256 183 1069 Glycine max DNA G256 183 1070 Glycinemax DNA G256 183 1071 Oryza sativa DNA G256 183 1072 Oryza sativa PRTG256 183 1073 Oryza sativa PRT G256 183 1074 Oryza sativa PRT G256 1831075 Oryza sativa PRT G256 183 1076 Oryza sativa PRT G256 183 1077 Zeamays DNA G256 183 1078 Zea mays DNA G256 183 1079 Zea mays DNA G256 1831080 Zea mays DNA G256 183 1081 Zea mays DNA G256 183 1082 Zea mays DNAG256 183 1083 Glycine max DNA G325 223 1084 Zea mays DNA G325 223 1085Glycine max DNA G361 237 1086 Glycine max DNA G361 237 1087 Glycine maxDNA G361 237 1088 Glycine max DNA G361 237 1089 Glycine max DNA G361 2371090 Oryza sativa DNA G361 237 1091 Oryza sativa PRT G361 237 1092 Oryzasativa PRT G361 237 1093 Oryza sativa PRT G361 237 1094 Oryza sativa PRTG361 237 1095 Oryza sativa PRT G361 237 1096 Zea mays DNA G361 237 1097Zea mays DNA G361 237 1098 Glycine max DNA G390 249 Glycine max DNA G438283 1099 Glycine max DNA G390 249 Glycine max DNA G438 283 1100 Glycinemax DNA G390 249 Glycine max DNA G438 283 1101 Glycine max DNA G390 249Glycine max DNA G438 283 1102 Glycine max DNA G390 249 
Glycine max DNAG438 283 1103 Glycine max DNA G390 249 Glycine max DNA G438 283 1104Glycine max DNA G390 249 Glycine max DNA G438 283 1105 Glycine max DNAG390 249 1106 Glycine max DNA G390 249 Glycine max DNA G438 283 1107Glycine max DNA G390 249 Glycine max DNA G438 283 1108 Oryza sativa DNAG390 249 1109 Oryza sativa PRT G390 249 Oryza sativa PRT G438 283 1110Oryza sativa PRT G390 249 Oryza sativa PRT G438 283 1111 Oryza sativaPRT G390 249 Oryza sativa PRT G438 283 1112 Oryza sativa PRT G390 249Oryza sativa PRT G438 283 1113 Oryza sativa DNA G390 249 Oryza sativaDNA G438 283 1114 Zea mays DNA G390 249 Zea mays DNA G438 283 1115 Zeamays DNA G390 249 Zea mays DNA G438 283 1116 Zea mays DNA G390 249 Zeamays DNA G438 283 1117 Zea mays DNA G390 249 1118 Zea mays DNA G390 249Zea mays DNA G438 283 1119 Zea mays DNA G390 249 Zea mays DNA G438 2831120 Zea mays DNA G390 249 Zea mays DNA G438 283 1121 Zea mays DNA G390249 Zea mays DNA G438 283 1122 Zea mays DNA G390 249 Zea mays DNA G438283 1123 Zea mays DNA G390 249 Zea mays DNA G438 283 1124 Glycine maxDNA G409 261 1125 Glycine max DNA G409 261 1126 Glycine max DNA G409 2611127 Glycine max DNA G409 261 1128 Glycine max DNA G409 261 1129 Glycinemax DNA G409 261 1130 Glycine max DNA G409 261 1131 Glycine max DNA G409261 1132 Oryza sativa DNA G409 261 1133 Oryza sativa DNA G409 261 1134Oryza sativa DNA G409 261 1135 Zea mays DNA G409 261 1136 Zea mays DNAG409 261 1137 Zea mays DNA G409 261 1138 Zea mays DNA G409 261 1139 Zeamays DNA G409 261 1140 Zea mays DNA G409 261 1141 Zea mays DNA G409 2611142 Glycine max DNA G438 283 1143 Oryza sativa DNA G438 283 1144 Oryzasativa DNA G438 283 1145 Oryza sativa DNA G438 283 1146 Oryza sativa DNAG438 283 1147 Oryza sativa DNA G438 283 1148 Zea mays DNA G438 283 1149Oryza sativa DNA G464 291 1150 Oryza sativa PRT G464 291 1151 Zea maysDNA G464 291 1152 Glycine max DNA G470 295 1153 Oryza sativa DNA G470295 1154 Oryza sativa DNA G470 295 1155 Glycine max DNA G475 301 1156Glycine max DNA 
G482 305 1157 Glycine max DNA G482 305 1158 Glycine maxDNA G482 305 1159 Glycine max DNA G482 305 1160 Glycine max DNA G482 3051161 Glycine max DNA G482 305 1162 Glycine max DNA G482 305 1163 Glycinemax DNA G482 305 1164 Glycine max DNA G482 305 1165 Oryza sativa PRTG482 305 1166 Oryza sativa PRT G482 305 1167 Oryza sativa PRT G482 3051168 Oryza sativa PRT G482 305 1169 Oryza sativa PRT G482 305 1170 Oryzasativa DNA G482 305 1171 Zea mays DNA G482 305 1172 Zea mays DNA G482305 1173 Zea mays DNA G482 305 1174 Zea mays DNA G482 305 1175 Zea maysDNA G482 305 1176 Zea mays DNA G482 305 1177 Zea mays DNA G482 305 1178Zea mays DNA G482 305 1179 Zea mays DNA G482 305 1180 Glycine max DNAG489 309 1181 Glycine max DNA G489 309 1182 Glycine max DNA G489 3091183 Glycine max DNA G489 309 1184 Glycine max DNA G489 309 1185 Glycinemax DNA G489 309 1186 Glycine max DNA G489 309 1187 Oryza sativa DNAG489 309 1188 Oryza sativa DNA G489 309 1189 Oryza sativa PRT G489 3091190 Oryza sativa PRT G489 309 1191 Oryza sativa PRT G489 309 1192 Zeamays DNA G489 309 1193 Glycine max DNA G509 317 1194 Glycine max DNAG509 317 1195 Glycine max DNA G509 317 1196 Oryza sativa DNA G509 3171197 Oryza sativa PRT G509 317 1198 Oryza sativa PRT G509 317 1199 Oryzasativa PRT G509 317 1200 Oryza sativa DNA G509 317 1201 Zea mays DNAG509 317 1202 Zea mays DNA G509 317 1203 Zea mays DNA G509 317 1204 Zeamays DNA G509 317 1205 Glycine max DNA G545 345 1206 Glycine max DNAG545 345 1207 Glycine max DNA G545 345 1208 Glycine max DNA G545 3451209 Glycine max DNA G545 345 1210 Glycine max DNA G545 345 1211 Glycinemax DNA G545 345 1212 Oryza sativa DNA G545 345 1213 Oryza sativa PRTG545 345 1214 Oryza sativa PRT G545 345 1215 Oryza sativa PRT G545 3451216 Oryza sativa PRT G545 345 1217 Zea mays DNA G545 345 1218 Zea maysDNA G545 345 1219 Zea mays DNA G545 345 1220 Zea mays DNA G561 359 1221Glycine max DNA G562 361 1222 Glycine max DNA G562 361 1223 Glycine maxDNA G562 361 1224 Glycine max DNA G562 361 1225 
Glycine max DNA G562 3611226 Oryza sativa PRT G562 361 1227 Oryza sativa PRT G562 361 1228 Zeamays DNA G562 361 1229 Zea mays DNA G562 361 1230 Zea mays DNA G562 3611231 Glycine max DNA G567 369 1232 Oryza sativa DNA G567 369 1233 Oryzasativa PRT G567 369 1234 Glycine max DNA G584 385 1235 Glycine max DNAG584 385 1236 Glycine max DNA G584 385 1237 Glycine max DNA G584 3851238 Glycine max DNA G584 385 1239 Oryza sativa PRT G584 385 1240 Zeamays DNA G584 385 1241 Zea mays DNA G584 385 1242 Zea mays DNA G584 3851243 Glycine max DNA G590 387 1244 Glycine max DNA G590 387 1245 Glycinemax DNA G590 387 1246 Oryza sativa PRT G590 387 1247 Oryza sativa PRTG590 387 1248 Oryza sativa DNA G590 387 1249 Zea mays DNA G590 387 1250Glycine max DNA G592 391 1251 Glycine max DNA G592 391 1252 Glycine maxDNA G592 391 1253 Glycine max DNA G592 391 1254 Glycine max DNA G592 3911255 Oryza sativa DNA G592 391 1256 Oryza sativa DNA G592 391 1257 Oryzasativa DNA G592 391 1258 Oryza sativa PRT G592 391 1259 Oryza sativa PRTG592 391 1260 Oryza sativa DNA G592 391 1261 Zea mays DNA G592 391 1262Zea mays DNA G592 391 1263 Zea mays DNA G592 391 1264 Zea mays DNA G592391 1265 Glycine max DNA G627 405 1266 Glycine max DNA G627 405 1267Oryza sativa DNA G627 405 1268 Oryza sativa DNA G634 415 1269 Oryzasativa PRT G634 415 1270 Oryza sativa DNA G634 415 1271 Oryza sativa DNAG634 415 1272 Zea mays DNA G634 415 1273 Zea mays DNA G634 415 1274 Zeamays DNA G634 415 1275 Glycine max DNA G636 417 1276 Glycine max DNAG636 417 1277 Glycine max DNA G636 417 1278 Glycine max DNA G636 4171279 Glycine max DNA G636 417 1280 Glycine max DNA G636 417 1281 Glycinemax DNA G636 417 1282 Glycine max DNA G636 417 1283 Oryza sativa DNAG636 417 1284 Oryza sativa DNA G636 417 1285 Oryza sativa DNA G636 4171286 Oryza sativa DNA G636 417 1287 Zea mays DNA G636 417 1288 Zea maysDNA G636 417 1289 Zea mays DNA G636 417 1290 Zea mays DNA G636 417 1291Glycine max DNA G638 419 1292 Glycine max DNA G638 419 1293 Glycine maxDNA 
G638 419
1294 | Glycine max | DNA | G638 | 419
1295 | Glycine max | DNA | G663 | 435
1296 | Glycine max | DNA | G664 | 437
1297 | Glycine max | DNA | G664 | 437
1298 | Glycine max | DNA | G664 | 437
1299 | Glycine max | DNA | G664 | 437
1300 | Glycine max | DNA | G664 | 437
1301 | Glycine max | DNA | G664 | 437
1302 | Glycine max | DNA | G664 | 437
1303 | Oryza sativa | DNA | G664 | 437
1304 | Oryza sativa | DNA | G664 | 437
1305 | Oryza sativa | DNA | G664 | 437
1306 | Oryza sativa | DNA | G664 | 437
1307 | Oryza sativa | PRT | G664 | 437
1308 | Oryza sativa | PRT | G664 | 437
1309 | Oryza sativa | PRT | G664 | 437
1310 | Oryza sativa | PRT | G664 | 437
1311 | Zea mays | DNA | G664 | 437
1312 | Zea mays | DNA | G664 | 437
1313 | Zea mays | DNA | G664 | 437
1314 | Zea mays | DNA | G664 | 437
1315 | Zea mays | DNA | G664 | 437
1316 | Zea mays | DNA | G664 | 437
1317 | Zea mays | DNA | G664 | 437
1318 | Zea mays | DNA | G664 | 437
1319 | Oryza sativa | DNA | G680 | 463
1320 | Zea mays | DNA | G680 | 463
1321 | Glycine max | DNA | G736 | 487
1322 | Glycine max | DNA | G736 | 487
1323 | Oryza sativa | PRT | G736 | 487
1324 | Glycine max | DNA | G748 | 497
1325 | Glycine max | DNA | G748 | 497
1326 | Glycine max | DNA | G748 | 497
1327 | Oryza sativa | DNA | G748 | 497
1328 | Oryza sativa | DNA | G748 | 497
1329 | Oryza sativa | PRT | G748 | 497
1330 | Oryza sativa | PRT | G748 | 497
1331 | Oryza sativa | PRT | G748 | 497
1332 | Oryza sativa | PRT | G748 | 497
1333 | Zea mays | DNA | G748 | 497
1334 | Glycine max | DNA | G789 | 539
1335 | Glycine max | DNA | G789 | 539
1336 | Oryza sativa | DNA | G789 | 539
1337 | Oryza sativa | DNA | G789 | 539
1338 | Oryza sativa | PRT | G789 | 539
1339 | Oryza sativa | PRT | G789 | 539
1340 | Oryza sativa | PRT | G789 | 539
1341 | Zea mays | DNA | G789 | 539
1342 | Glycine max | DNA | G801 | 549
1343 | Glycine max | DNA | G801 | 549
1344 | Zea mays | DNA | G801 | 549
1345 | Glycine max | DNA | G849 | 565
1346 | Glycine max | DNA | G849 | 565
1347 | Glycine max | DNA | G849 | 565
1348 | Glycine max | DNA | G849 | 565
1349 | Glycine max | DNA | G849 | 565
1350 | Glycine max | DNA | G849 | 565
1351 | Zea mays | DNA | G849 | 565
1352 | Zea mays | DNA | G849 | 565
1353 | Zea mays | DNA | G849 | 565
1354 | Glycine max | DNA | G864 | 573
1355 | Glycine max | DNA | G864 | 573
1356 | Glycine max | DNA | G864 | 573
1357 | Glycine max | DNA | G864 | 573
1358 | Glycine max | DNA | G864 | 573
1359 | Glycine max | DNA | G864 | 573
1360 | Oryza sativa | DNA | G864 | 573
1361 | Oryza sativa | PRT | G864 | 573
1362 | Oryza sativa | PRT | G864 | 573
1363 | Zea mays | DNA | G864 | 573
1364 | Zea mays | DNA | G864 | 573
1365 | Zea mays | DNA | G864 | 573
1366 | Glycine max | DNA | G867 | 579
1367 | Glycine max | DNA | G867 | 579
1368 | Glycine max | DNA | G867 | 579
1369 | Glycine max | DNA | G867 | 579
1370 | Glycine max | DNA | G867 | 579
1371 | Glycine max | DNA | G867 | 579
1372 | Oryza sativa | DNA | G867 | 579
1373 | Oryza sativa | PRT | G867 | 579
1374 | Oryza sativa | PRT | G867 | 579
1375 | Oryza sativa | PRT | G867 | 579
1376 | Oryza sativa | DNA | G867 | 579
1377 | Zea mays | DNA | G867 | 579
1378 | Zea mays | DNA | G867 | 579
1379 | Zea mays | DNA | G867 | 579
1380 | Zea mays | DNA | G867 | 579
1381 | Glycine max | DNA | G869 | 581
1382 | Glycine max | DNA | G869 | 581
1383 | Oryza sativa | DNA | G869 | 581
1384 | Oryza sativa | PRT | G869 | 581
1385 | Zea mays | DNA | G869 | 581
1386 | Glycine max | DNA | G877 | 583
1387 | Oryza sativa | DNA | G877 | 583
1388 | Oryza sativa | DNA | G877 | 583
1389 | Oryza sativa | PRT | G877 | 583
1390 | Oryza sativa | PRT | G877 | 583
1391 | Oryza sativa | PRT | G877 | 583
1392 | Zea mays | DNA | G877 | 583
1393 | Zea mays | DNA | G877 | 583
1394 | Zea mays | DNA | G877 | 583
1395 | Glycine max | DNA | G881 | 587
1396 | Oryza sativa | PRT | G881 | 587
1397 | Oryza sativa | DNA | G881 | 587
1398 | Oryza sativa | DNA | G881 | 587
1399 | Zea mays | DNA | G881 | 587
1400 | Zea mays | DNA | G881 | 587
1401 | Zea mays | DNA | G881 | 587
1402 | Zea mays | DNA | G881 | 587
1403 | Glycine max | DNA | G912 | 615
1404 | Glycine max | DNA | G912 | 615
1405 | Glycine max | DNA | G912 | 615
1406 | Glycine max | DNA | G912 | 615
1407 | Glycine max | DNA | G912 | 615
1408 | Glycine max | DNA | G912 | 615
1409 | Glycine max | DNA | G912 | 615
1410 | Oryza sativa | DNA | G912 | 615
1411 | Oryza sativa | PRT | G912 | 615
1412 | Oryza sativa | PRT | G912 | 615
1413 | Oryza sativa | PRT | G912 | 615
1414 | Oryza sativa | PRT | G912 | 615
1415 | Oryza sativa | DNA | G912 | 615
1416 | Zea mays | DNA | G912 | 615
1417 | Zea mays | DNA | G912 | 615
1418 | Zea mays | DNA | G912 | 615
1419 | Zea mays | DNA | G912 | 615
1420 | Zea mays | DNA | G912 | 615
1421 | Glycine max | DNA | G961 | 633
1422 | Glycine max | DNA | G961 | 633
1423 | Oryza sativa | DNA | G961 | 633
1424 | Oryza sativa | PRT | G961 | 633
1425 | Zea mays | DNA | G961 | 633
1426 | Zea mays | DNA | G961 | 633
1427 | Zea mays | DNA | G961 | 633
1428 | Glycine max | DNA | G974 | 641
1429 | Glycine max | DNA | G974 | 641
1430 | Glycine max | DNA | G974 | 641
1431 | Glycine max | DNA | G974 | 641
1432 | Glycine max | DNA | G974 | 641
1433 | Glycine max | DNA | G974 | 641
1434 | Oryza sativa | DNA | G974 | 641
1435 | Oryza sativa | PRT | G974 | 641
1436 | Oryza sativa | PRT | G974 | 641
1437 | Oryza sativa | PRT | G974 | 641
1438 | Zea mays | DNA | G974 | 641
1439 | Zea mays | DNA | G974 | 641
1440 | Zea mays | DNA | G974 | 641
1441 | Zea mays | DNA | G974 | 641
1442 | Glycine max | DNA | G975 | 643
1443 | Glycine max | DNA | G975 | 643
1444 | Glycine max | DNA | G975 | 643
1445 | Glycine max | DNA | G975 | 643
1446 | Glycine max | DNA | G975 | 643
1447 | Oryza sativa | DNA | G975 | 643
1448 | Oryza sativa | PRT | G975 | 643
1449 | Oryza sativa | DNA | G975 | 643
1450 | Zea mays | DNA | G975 | 643
1451 | Zea mays | DNA | G975 | 643
1452 | Glycine max | DNA | G979 | 649
1453 | Glycine max | DNA | G979 | 649
1454 | Glycine max | DNA | G979 | 649
1455 | Oryza sativa | DNA | G979 | 649
1456 | Oryza sativa | PRT | G979 | 649
1457 | Oryza sativa | PRT | G979 | 649
1458 | Oryza sativa | PRT | G979 | 649
1459 | Oryza sativa | PRT | G979 | 649
1460 | Oryza sativa | PRT | G979 | 649
1461 | Zea mays | DNA | G979 | 649
1462 | Zea mays | DNA | G979 | 649
1463 | Zea mays | DNA | G979 | 649
1464 | Glycine max | DNA | G987 | 653
1465 | Glycine max | DNA | G987 | 653
1466 | Glycine max | DNA | G987 | 653
1467 | Glycine max | DNA | G987 | 653
1468 | Glycine max | DNA | G987 | 653
1469 | Glycine max | DNA | G987 | 653
1470 | Oryza sativa | DNA | G987 | 653
1471 | Oryza sativa | DNA | G987 | 653
1472 | Oryza sativa | PRT | G987 | 653
1473 | Zea mays | DNA | G987 | 653
1474 | Glycine max | DNA | G1052 | 699
1475 | Glycine max | DNA | G1052 | 699
1476 | Glycine max | DNA | G1052 | 699
1477 | Glycine max | DNA | G1052 | 699
1478 | Glycine max | DNA | G1052 | 699
1479 | Glycine max | DNA | G1052 | 699
1480 | Glycine max | DNA | G1052 | 699
1481 | Oryza sativa | DNA | G1052 | 699
1482 | Oryza sativa | DNA | G1052 | 699
1483 | Oryza sativa | PRT | G1052 | 699
1484 | Oryza sativa | PRT | G1052 | 699
1485 | Zea mays | DNA | G1052 | 699
1486 | Zea mays | DNA | G1052 | 699
1487 | Zea mays | DNA | G1052 | 699
1488 | Zea mays | DNA | G1052 | 699
1489 | Zea mays | DNA | G1052 | 699
1490 | Zea mays | DNA | G1052 | 699
1491 | Zea mays | DNA | G1052 | 699
1492 | Zea mays | DNA | G1052 | 699
1493 | Zea mays | DNA | G1052 | 699
1494 | Glycine max | DNA | G1062 | 713
1495 | Glycine max | DNA | G1062 | 713
1496 | Glycine max | DNA | G1062 | 713
1497 | Glycine max | DNA | G1062 | 713
1498 | Oryza sativa | DNA | G1062 | 713
1499 | Oryza sativa | DNA | G1062 | 713
1500 | Oryza sativa | PRT | G1062 | 713
1501 | Zea mays | DNA | G1062 | 713
1502 | Zea mays | DNA | G1062 | 713
1503 | Zea mays | DNA | G1062 | 713
1504 | Zea mays | DNA | G1062 | 713
1505 | Zea mays | DNA | G1062 | 713
1506 | Glycine max | DNA | G1069 | 721
1507 | Glycine max | DNA | G1069 | 721
1508 | Oryza sativa | PRT | G1069 | 721
1509 | Zea mays | DNA | G1069 | 721
1510 | Oryza sativa | PRT | G1073 | 723
1511 | Oryza sativa | PRT | G1073 | 723
1512 | Glycine max | DNA | G1075 | 725
1513 | Glycine max | DNA | G1075 | 725
1514 | Glycine max | DNA | G1075 | 725
1515 | Glycine max | DNA | G1075 | 725
1516 | Glycine max | DNA | G1075 | 725
1517 | Oryza sativa | DNA | G1075 | 725
1518 | Oryza sativa | DNA | G1075 | 725
1519 | Oryza sativa | DNA | G1075 | 725
1520 | Oryza sativa | PRT | G1089 | 731
1521 | Oryza sativa | DNA | G1089 | 731
1522 | Zea mays | DNA | G1089 | 731
1523 | Zea mays | DNA | G1089 | 731
1524 | Zea mays | DNA | G1089 | 731
1525 | Zea mays | DNA | G1089 | 731
1526 | Zea mays | DNA | G1089 | 731
1527 | Glycine max | DNA | G1134 | 741
1528 | Glycine max | DNA | G1134 | 741
1529 | Oryza sativa | DNA | G1134 | 741
1530 | Glycine max | DNA | G1145 | 749
1531 | Glycine max | DNA | G1145 | 749
1532 | Glycine max | DNA | G1145 | 749
1533 | Glycine max | DNA | G1145 | 749
1534 | Glycine max | DNA | G1145 | 749
1535 | Glycine max | DNA | G1145 | 749
1536 | Glycine max | DNA | G1145 | 749
1537 | Glycine max | DNA | G1145 | 749
1538 | Oryza sativa | PRT | G1145 | 749
1539 | Oryza sativa | PRT | G1145 | 749
1540 | Oryza sativa | PRT | G1145 | 749
1541 | Oryza sativa | PRT | G1145 | 749
1542 | Oryza sativa | PRT | G1145 | 749
1543 | Oryza sativa | PRT | G1145 | 749
1544 | Oryza sativa | DNA | G1145 | 749
1545 | Zea mays | DNA | G1145 | 749
1546 | Zea mays | DNA | G1145 | 749
1547 | Zea mays | DNA | G1145 | 749
1548 | Zea mays | DNA | G1145 | 749
1549 | Zea mays | DNA | G1145 | 749
1550 | Glycine max | DNA | G1198 | 763
1551 | Glycine max | DNA | G1198 | 763
1552 | Glycine max | DNA | G1198 | 763
1553 | Glycine max | DNA | G1198 | 763
1554 | Glycine max | DNA | G1198 | 763
1555 | Glycine max | DNA | G1198 | 763
1556 | Glycine max | DNA | G1198 | 763
1557 | Glycine max | DNA | G1198 | 763
1558 | Oryza sativa | DNA | G1198 | 763
1559 | Oryza sativa | DNA | G1198 | 763
1560 | Oryza sativa | DNA | G1198 | 763
1561 | Oryza sativa | DNA | G1198 | 763
1562 | Oryza sativa | DNA | G1198 | 763
1563 | Oryza sativa | PRT | G1198 | 763
1564 | Oryza sativa | PRT | G1198 | 763
1565 | Oryza sativa | PRT | G1198 | 763
1566 | Oryza sativa | PRT | G1198 | 763
1567 | Oryza sativa | PRT | G1198 | 763
1568 | Oryza sativa | PRT | G1198 | 763
1569 | Oryza sativa | PRT | G1198 | 763
1570 | Zea mays | DNA | G1198 | 763
1571 | Zea mays | DNA | G1198 | 763
1572 | Zea mays | DNA | G1198 | 763
1573 | Zea mays | DNA | G1198 | 763
1574 | Zea mays | DNA | G1198 | 763
1575 | Zea mays | DNA | G1198 | 763
1576 | Zea mays | DNA | G1198 | 763
1577 | Zea mays | DNA | G1198 | 763
1578 | Zea mays | DNA | G1198 | 763
1579 | Zea mays | DNA | G1198 | 763
1580 | Glycine max | DNA | G1242 | 787
1581 | Oryza sativa | DNA | G1242 | 787
1582 | Oryza sativa | PRT | G1242 | 787
1583 | Oryza sativa | PRT | G1242 | 787
1584 | Zea mays | DNA | G1242 | 787
1585 | Zea mays | DNA | G1242 | 787
1586 | Glycine max | DNA | G1255 | 793
1587 | Glycine max | DNA | G1255 | 793
1588 | Glycine max | DNA | G1255 | 793
1589 | Glycine max | DNA | G1255 | 793
1590 | Glycine max | DNA | G1255 | 793
1591 | Glycine max | DNA | G1255 | 793
1592 | Glycine max | DNA | G1255 | 793
1593 | Oryza sativa | DNA | G1255 | 793
1594 | Oryza sativa | PRT | G1255 | 793
1595 | Oryza sativa | DNA | G1255 | 793
1596 | Oryza sativa | DNA | G1255 | 793
1597 | Oryza sativa | DNA | G1255 | 793
1598 | Zea mays | DNA | G1255 | 793
1599 | Zea mays | DNA | G1255 | 793
1600 | Zea mays | DNA | G1255 | 793
1601 | Zea mays | DNA | G1255 | 793
1602 | Zea mays | DNA | G1255 | 793
1603 | Zea mays | DNA | G1255 | 793
1604 | Glycine max | DNA | G1266 | 799
1605 | Glycine max | DNA | G1266 | 799
1606 | Glycine max | DNA | G1266 | 799
1607 | Glycine max | DNA | G1266 | 799
1608 | Oryza sativa | DNA | G1266 | 799
1609 | Glycine max | DNA | G1274 | 805
1610 | Glycine max | DNA | G1274 | 805
1611 | Oryza sativa | PRT | G1274 | 805
1612 | Oryza sativa | PRT | G1274 | 805
1613 | Zea mays | DNA | G1274 | 805
1614 | Zea mays | DNA | G1274 | 805
1615 | Zea mays | DNA | G1274 | 805
1616 | Zea mays | DNA | G1274 | 805
1617 | Oryza sativa | DNA | G1275 | 807
1618 | Oryza sativa | PRT | G1275 | 807
1619 | Oryza sativa | PRT | G1275 | 807
1620 | Oryza sativa | PRT | G1275 | 807
1621 | Zea mays | DNA | G1275 | 807
1622 | Zea mays | DNA | G1275 | 807
1623 | Zea mays | DNA | G1275 | 807
1624 | Glycine max | DNA | G1313 | 829
1625 | Oryza sativa | DNA | G1313 | 829
1626 | Oryza sativa | PRT | G1313 | 829
1627 | Oryza sativa | PRT | G1313 | 829
1628 | Zea mays | DNA | G1313 | 829
1629 | Zea mays | DNA | G1313 | 829
1630 | Zea mays | DNA | G1313 | 829
1631 | Glycine max | DNA | G1322 | 841
1632 | Glycine max | DNA | G1322 | 841
1633 | Glycine max | DNA | G1322 | 841
1634 | Oryza sativa | DNA | G1322 | 841
1635 | Oryza sativa | PRT | G1322 | 841
1636 | Oryza sativa | PRT | G1322 | 841
1637 | Zea mays | DNA | G1323 | 843
1638 | Zea mays | DNA | G1323 | 843
1639 | Glycine max | DNA | G1417 | 881
1640 | Oryza sativa | PRT | G1417 | 881
1641 | Oryza sativa | PRT | G1417 | 881
1642 | Glycine max | DNA | G1449 | 891
1643 | Glycine max | DNA | G1449 | 891
1644 | Oryza sativa | DNA | G1449 | 891
1645 | Oryza sativa | DNA | G1449 | 891
1646 | Zea mays | DNA | G1449 | 891
1647 | Zea mays | DNA | G1449 | 891
1648 | Zea mays | DNA | G1449 | 891
1649 | Zea mays | DNA | G1449 | 891
1650 | Glycine max | DNA | G1451 | 893
1651 | Glycine max | DNA | G1451 | 893
1652 | Oryza sativa | DNA | G1451 | 893
1653 | Oryza sativa | DNA | G1451 | 893
1654 | Oryza sativa | DNA | G1451 | 893
1655 | Oryza sativa | PRT | G1451 | 893
1656 | Oryza sativa | PRT | G1451 | 893
1657 | Oryza sativa | PRT | G1451 | 893
1658 | Oryza sativa | PRT | G1451 | 893
1659 | Zea mays | DNA | G1451 | 893
1660 | Zea mays | DNA | G1451 | 893
1661 | Zea mays | DNA | G1451 | 893
1662 | Zea mays | DNA | G1451 | 893
1663 | Glycine max | DNA | G1482 | 905
1664 | Glycine max | DNA | G1482 | 905
1665 | Glycine max | DNA | G1482 | 905
1666 | Glycine max | DNA | G1482 | 905
1667 | Glycine max | DNA | G1482 | 905
1668 | Oryza sativa | DNA | G1482 | 905
1669 | Oryza sativa | DNA | G1482 | 905
1670 | Oryza sativa | DNA | G1482 | 905
1671 | Oryza sativa | DNA | G1482 | 905
1672 | Oryza sativa | PRT | G1482 | 905
1673 | Oryza sativa | PRT | G1482 | 905
1674 | Zea mays | DNA | G1482 | 905
1675 | Zea mays | DNA | G1482 | 905
1676 | Zea mays | DNA | G1482 | 905
1677 | Zea mays | DNA | G1482 | 905
1678 | Zea mays | DNA | G1482 | 905
1679 | Zea mays | DNA | G1482 | 905
1680 | Oryza sativa | PRT | G1499 | 913
1681 | Glycine max | DNA | G1540 | 919
1682 | Oryza sativa | PRT | G1540 | 919
1683 | Glycine max | DNA | G1560 | 925
1684 | Glycine max | DNA | G1560 | 925
1685 | Oryza sativa | DNA | G1560 | 925
1686 | Oryza sativa | PRT | G1560 | 925
1687 | Oryza sativa | PRT | G1560 | 925
1688 | Oryza sativa | PRT | G1560 | 925
1689 | Oryza sativa | PRT | G1560 | 925
1690 | Zea mays | DNA | G1560 | 925
1691 | Zea mays | DNA | G1560 | 925
1692 | Zea mays | DNA | G1560 | 925
1693 | Zea mays | DNA | G1560 | 925
1694 | Zea mays | DNA | G1560 | 925
1695 | Zea mays | DNA | G1560 | 925
1696 | Oryza sativa | PRT | G1645 | 929
1697 | Zea mays | DNA | G1645 | 929
1698 | Zea mays | DNA | G1645 | 929
1699 | Zea mays | DNA | G1645 | 929
1700 | Glycine max | DNA | G1760 | 937
1701 | Glycine max | DNA | G1760 | 937
1702 | Oryza sativa | PRT | G1760 | 937
1703 | Oryza sativa | PRT | G1760 | 937
1704 | Zea mays | DNA | G1760 | 937
1705 | Zea mays | DNA | G1760 | 937
1706 | Zea mays | DNA | G1760 | 937
1707 | Oryza sativa | DNA | G1816 | 939
1708 | Glycine max | DNA | G2010 | 951
 | Glycine max | DNA | G2347 | 957
1709 | Oryza sativa | DNA | G2010 | 951
 | Oryza sativa | DNA | G2347 | 957
1710 | Zea mays | DNA | G2010 | 951
1711 | Zea mays | DNA | G2010 | 951
 | Zea mays | DNA | G2347 | 957
1970 | Glycine max | DNA | G859 | 567
 | Glycine max | DNA | G1842 | 943
 | Glycine max | DNA | G1843 | 945
 | Glycine max | DNA | G1844 | 947
1971 | Glycine max | PRT | G859 | 568
 | Glycine max | PRT | G1842 | 944
 | Glycine max | PRT | G1843 | 946
 | Glycine max | PRT | G1844 | 948
1972 | Glycine max | DNA | G859 | 567
 | Glycine max | DNA | G1842 | 943
 | Glycine max | DNA | G1843 | 945
 | Glycine max | DNA | G1844 | 947
1973 | Glycine max | PRT | G859 | 568
 | Glycine max | PRT | G1842 | 944
 | Glycine max | PRT | G1843 | 946
 | Glycine max | PRT | G1844 | 948

Table 8 lists a summary of homologous sequences identified using BLAST (tblastx program). The first column shows the polynucleotide sequence identifier (SEQ ID NO:), the second column shows the corresponding cDNA identifier (Gene ID or GID), the third column shows the orthologous or homologous polynucleotide GenBank Accession Number (Test Sequence ID), the fourth column shows the calculated probability value that the sequence identity is due to chance (Smallest Sum Probability), the fifth column shows the plant species from which the test sequence was isolated (Test Sequence Species), and the sixth column shows the orthologous or homologous test sequence GenBank annotation (Test Sequence GenBank Annotation).
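For readers who want to handle Table 8 programmatically, the six columns described above can be sketched as a simple record type. This is only an illustrative sketch: the field names, the helper `best_hit_per_gid`, and the placeholder rows are assumptions introduced here and are not taken from Table 8 or from any BLAST output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table8Row:
    """One row of Table 8; field names are hypothetical labels for the six columns."""
    seq_id_no: int                   # column 1: SEQ ID NO:
    gid: str                         # column 2: Gene ID (GID)
    test_sequence_id: str            # column 3: GenBank accession of the test sequence
    smallest_sum_probability: float  # column 4: probability that identity is due to chance
    test_sequence_species: str       # column 5: species of the test sequence
    genbank_annotation: str          # column 6: GenBank annotation of the test sequence


def best_hit_per_gid(rows):
    """Keep, for each GID, the row with the lowest smallest-sum probability."""
    best = {}
    for row in rows:
        current = best.get(row.gid)
        if current is None or row.smallest_sum_probability < current.smallest_sum_probability:
            best[row.gid] = row
    return best


# Placeholder rows for illustration only; these values do not come from Table 8.
rows = [
    Table8Row(1, "G_EXAMPLE", "ACC_A", 1.0e-50, "Species A", "placeholder annotation"),
    Table8Row(1, "G_EXAMPLE", "ACC_B", 1.0e-20, "Species B", "placeholder annotation"),
]

best = best_hit_per_gid(rows)
print(best["G_EXAMPLE"].test_sequence_id)
```

A lower Smallest Sum Probability means the match is less likely to be due to chance, which is why the sketch keeps the minimum per GID.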

Table 9 lists sequences discovered to be paralogous to a number of transcription factors of the present invention. The column headings include, from left to right, the Arabidopsis SEQ ID NO; the corresponding Arabidopsis Gene ID (GID) numbers; the SEQ ID NOs of the paralogs discovered in a database search; and the GID numbers of the paralogs.

TABLE 9 Arabidopsis Transcription Factors and Paralogs
SEQ ID NO: | GID No. | Paralog SEQ ID NO: | Paralog GID No.
30 | G24 | 1717 | G12
 | | 810 | G1277
 | | 1859 | G1379
38 | G28 | 670 | G1006
54 | G46 | 666 | G1004
 | | 1863 | G1419
 | | 1719 | G29
 | | 50 | G43
70 | G157 | 1875 | G1759
 | | 944 | G1842
 | | 946 | G1843
 | | 948 | G1844
 | | 568 | G859
106 | G196 | 1739 | G182
128 | G214 | 464 | G680
142 | G226 | 940 | G1816
 | | 140 | G225
 | | 960 | G2718
 | | 468 | G682
164 | G241 | 156 | G233
172 | G248 | 1877 | G1785
180 | G254 | 146 | G228
184 | G256 | 1803 | G666
 | | 444 | G668
 | | 628 | G932
210 | G291 | 766 | G1211
224 | G325 | 1897 | G1998
238 | G361 | 1895 | G1995
 | | 1935 | G2826
 | | 1937 | G2838
 | | 1767 | G362
 | | 1769 | G370
250 | G390 | 1869 | G1548
 | | 1773 | G391
 | | 1775 | G392
 | | 284 | G438
284 | G438 | 1869 | G1548
 | | 250 | G390
 | | 1773 | G391
 | | 1775 | G392
292 | G464 | 1781 | G463
306 | G482 | 1857 | G1364
 | | 1911 | G2345
 | | 1783 | G481
 | | 1785 | G485
310 | G489 | 1811 | G714
346 | G545 | 1763 | G350
 | | 1765 | G351
386 | G584 | 746 | G1136
406 | G627 | 1729 | G149
436 | G663 | 1853 | G1329
 | | 1915 | G2421
 | | 1917 | G2422
438 | G664 | 108 | G197
 | | 182 | G255
458 | G676 | 124 | G212
 | | 170 | G247
464 | G680 | | G214
468 | G682 | 940 | G1816
 | | 140 | G225
 | | 142 | G226
 | | 960 | G2718
488 | G736 | 1921 | G2432
540 | G789 | 1867 | G1494
566 | G849 | 398 | G610
568 | G859 | 70 | G157
 | | 1875 | G1759
 | | 944 | G1842
 | | 946 | G1843
 | | 948 | G1844
574 | G864 | 1873 | G1750
 | | 1779 | G440
580 | G867 | 1893 | G1930
 | | 14 | G9
 | | 1829 | G993
584 | G877 | 1737 | G175
588 | G881 | 652 | G986
596 | G896 | 1855 | G1349
 | | 1889 | G1887
616 | G912 | 1903 | G2107
 | | 1923 | G2513
 | | 44 | G40
 | | 46 | G41
 | | 48 | G42
634 | G961 | 1925 | G2535
 | | 1823 | G957
640 | G971 | 1819 | G914
642 | G974 | 6 | G5
644 | G975 | 1861 | G1387
 | | 1929 | G2583
650 | G979 | 1901 | G2106
 | | 1905 | G2131
654 | G987 | 1933 | G3010
700 | G996 | 1835 | G1051
714 | G1062 | 934 | G1664
722 | G1069 | 1907 | G2153
724 | G1073 | 718 | G1067
 | | 1909 | G2156
726 | G1075 | 728 | G1076
742 | G1134 | 1927 | G2555
750 | G1145 | 706 | G1056
764 | G1198 | 1879 | G1806
 | | 352 | G554
 | | 354 | G555
 | | 1791 | G556
 | | 356 | G558
788 | G1242 | 790 | G1243
794 | G1255 | 1865 | G1484
830 | G1313 | 848 | G1325
842 | G1322 | 1747 | G221
 | | 174 | G249
844 | G1323 | 432 | G659
894 | G1451 | 1827 | G990
906 | G1482 | 1891 | G1888
928 | G1634 | 1932 | G2701
930 | G1645 | 1919 | G2424
938 | G1760 | 64 | G152
 | | 66 | G153
 | | 570 | G860
940 | G1816 | 140 | G225
 | | 142 | G226
 | | 960 | G2718
 | | 468 | G682
944 | G1842 | 70 | G157
 | | 1875 | G1759
 | | 946 | G1843
 | | 948 | G1844
 | | 568 | G859
946 | G1843 | 70 | G157
 | | 1875 | G1759
 | | 944 | G1842
 | | 948 | G1844
 | | 568 | G859
952 | G2010 | 958 | G2347
958 | G2347 | 952 | G2010
960 | G2718 | 940 | G1816
 | | 140 | G225
 | | 142 | G226
 | | 468 | G682
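Several entries in Table 9 are reciprocal: for example, G2010 lists G2347 as a paralog and G2347 lists G2010. A minimal sketch of how such reciprocity could be checked is given below. The dictionary holds only a handful of GID relationships copied from Table 9; the function names `asymmetric_pairs` and `paralog_groups` are hypothetical helpers introduced for illustration and are not part of the disclosed methods.

```python
# A few Arabidopsis GID -> paralog GID relationships copied from Table 9.
PARALOGS = {
    "G226": ["G1816", "G225", "G2718", "G682"],
    "G682": ["G1816", "G225", "G226", "G2718"],
    "G1816": ["G225", "G226", "G2718", "G682"],
    "G2718": ["G1816", "G225", "G226", "G682"],
    "G2010": ["G2347"],
    "G2347": ["G2010"],
    "G214": ["G680"],
    "G680": ["G214"],
}


def asymmetric_pairs(table):
    """Return (a, b) pairs where a lists b, b has its own row, and that row omits a."""
    missing = []
    for gid, paralogs in table.items():
        for paralog in paralogs:
            if paralog in table and gid not in table[paralog]:
                missing.append((gid, paralog))
    return missing


def paralog_groups(table):
    """Collect each gene together with its listed paralogs; identical sets collapse."""
    return {frozenset([gid, *paralogs]) for gid, paralogs in table.items()}


print(asymmetric_pairs(PARALOGS))     # no asymmetric pairs among these rows
print(len(paralog_groups(PARALOGS)))  # number of distinct paralog groups
```

Genes that appear only as paralogs (such as G225 in these rows) have no row of their own in the excerpt, so the sketch skips them when testing reciprocity.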

Table 10 lists the gene identification number (GID) and homologous relationships found using analyses according to Example IX for the sequences of the Sequence Listing.

TABLE 10 Homologous relationships found within the Sequence Listing
SEQ ID NO: | GID No. | DNA or Protein (PRT) | Species from Which Sequence is Derived | Homologous Relationship of SEQ ID NO: to Other Genes
961 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G8
962 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G8
963 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G8
964 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G8
965 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G8
967 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G8
968 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G8
969 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G8
970 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G19
971 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G19
972 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G19
973 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G19
974 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G19
975 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G19
976 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G19
977 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G19
978 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G19
979 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22
980 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G22
981 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G24
982 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G24
983 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G24
984 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G24
985 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G24
986 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G24
987 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G24
988 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G24
989 | | PRT | Oryza sativa | Orthologous to G24
990 | | PRT | Oryza sativa | Orthologous to G24
991 | | PRT | Oryza sativa | Orthologous to G24
992 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G24
993 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G28
994 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G28
995 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G28
996 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G28
997 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G28
998 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G28
999 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G28
1000 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G28
1001 | | PRT | Oryza sativa | Orthologous to G28
1002 | | PRT | Oryza sativa | Orthologous to G28
1003 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G28
1004 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G46
1005 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G46
1006 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G46
1007 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G46
1008 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G46
1009 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G46
1010 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G46
1011 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G46
1012 | | PRT | Oryza sativa | Orthologous to G46
1013 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G46
1014 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G157, G859, G1842, G1843
1015 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G180
1016 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G180
1017 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G180
1018 | | PRT | Oryza sativa | Orthologous to G180
1019 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G180
1020 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G188
1021 | | PRT | Oryza sativa | Orthologous to G188
1022 | | PRT | Oryza sativa | Orthologous to G188
1023 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G188
1024 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G192
1025 | | PRT | Oryza sativa | Orthologous to G192
1026 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G196
1027 | | PRT | Oryza sativa | Orthologous to G196
1028 | | PRT | Oryza sativa | Orthologous to G196
1029 | | PRT | Oryza sativa | Orthologous to G196
1030 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G196
1031 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G196
1032 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G211
1033 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G211
1034 | | PRT | Oryza sativa | Orthologous to G211
1035 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G214, G680
1036 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G214, G680
1037 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G214, G680
1038 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G214, G680
1039 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G214, G680
1040 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G214, G680
1041 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G214, G680
1042 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G214, G680
1043 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G214, G680
1044 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1045 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G1816, G2718
1046 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1047 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1048 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1049 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1050 | | PRT | Oryza sativa | Orthologous to G226, G682, G1816, G2718
1051 | | PRT | Oryza sativa | Orthologous to G226, G682, G1816, G2718
1052 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1053 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G226, G682, G1816, G2718
1054 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G241
1055 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G241
1056 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G241
1057 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G241
1058 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G241
1059 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G241
1060 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G241
1061 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G241
1062 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G241
1063 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G254
1064 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G256
1065 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G256
1066 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G256
1067 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G256
1068 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G256
1069 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G256
1070 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G256
1071 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G256
1072 | | PRT | Oryza sativa | Orthologous to G256
1073 | | PRT | Oryza sativa | Orthologous to G256
1074 | | PRT | Oryza sativa | Orthologous to G256
1075 | | PRT | Oryza sativa | Orthologous to G256
1076 | | PRT | Oryza sativa | Orthologous to G256
1077 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G256
1078 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G256
1079 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G256
1080 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G256
1081 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G256
1082 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G256
1083 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G325
1084 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G325
1085 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G361
1086 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G361
1087 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G361
1088 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G361
1089 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G361
1090 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G361
1091 | | PRT | Oryza sativa | Orthologous to G361
1092 | | PRT | Oryza sativa | Orthologous to G361
1093 | | PRT | Oryza sativa | Orthologous to G361
1094 | | PRT | Oryza sativa | Orthologous to G361
1095 | | PRT | Oryza sativa | Orthologous to G361
1096 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G361
1097 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G361
1098 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1099 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1100 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1101 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1102 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1103 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1104 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1105 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390
1106 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1107 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G390, G438
1108 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G390
1109 | | PRT | Oryza sativa | Orthologous to G390, G438
1110 | | PRT | Oryza sativa | Orthologous to G390, G438
1111 | | PRT | Oryza sativa | Orthologous to G390, G438
1112 | | PRT | Oryza sativa | Orthologous to G390, G438
1113 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G390, G438
1114 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1115 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1116 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1117 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390
1118 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1119 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1120 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1121 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1122 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1123 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G390, G438
1124 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G409
1125 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G409
1126 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G409
1127 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G409
1128 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G409
1129 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G409
1130 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G409
1131 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G409
1132 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G409
1133 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G409
1134 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G409
1135 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G409
1136 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G409
1137 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G409
1138 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G409
1139 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G409
1140 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G409
1141 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G409
1142 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G438
1143 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G438
1144 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G438
1145 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G438
1146 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G438
1147 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G438
1148 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G438
1149 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G464
1150 | | PRT | Oryza sativa | Orthologous to G464
1151 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G464
1152 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G470
1153 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G470
1154 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G470
1155 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G475
1156 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1157 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1158 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1159 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1160 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1161 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1162 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1163 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1164 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G482
1165 | | PRT | Oryza sativa | Orthologous to G482
1166 | | PRT | Oryza sativa | Orthologous to G482
1167 | | PRT | Oryza sativa | Orthologous to G482
1168 | | PRT | Oryza sativa | Orthologous to G482
1169 | | PRT | Oryza sativa | Orthologous to G482
1170 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G482
1171 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1172 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1173 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1174 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1175 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1176 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1177 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1178 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1179 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G482
1180 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489
1181 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489
1182 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489
1183 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489
1184 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489
1185 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489
1186 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G489
1187 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G489
1188 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G489
1189 | | PRT | Oryza sativa | Orthologous to G489
1190 | | PRT | Oryza sativa | Orthologous to G489
1191 | | PRT | Oryza sativa | Orthologous to G489
1192 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G489
1193 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G509
1194 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G509
1195 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G509
1196 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G509
1197 | | PRT | Oryza sativa | Orthologous to G509
1198 | | PRT | Oryza sativa | Orthologous to G509
1199 | | PRT | Oryza sativa | Orthologous to G509
1200 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G509
1201 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G509
1202 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G509
1203 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G509
1204 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G509
1205 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G545
1206 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G545
1207 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G545
1208 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G545
1209 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G545
1210 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G545
1211 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G545
1212 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G545
1213 | | PRT | Oryza sativa | Orthologous to G545
1214 | | PRT | Oryza sativa | Orthologous to G545
1215 | | PRT | Oryza sativa | Orthologous to G545
1216 | | PRT | Oryza sativa | Orthologous to G545
1217 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G545
1218 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G545
1219 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G545
1220 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G561
1221 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G562
1222 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G562
1223 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G562
1224 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G562
1225 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G562
1226 | | PRT | Oryza sativa | Orthologous to G562
1227 | | PRT | Oryza sativa | Orthologous to G562
1228 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G562
1229 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G562
1230 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G562
1231 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G567
1232 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G567
1233 | | PRT | Oryza sativa | Orthologous to G567
1234 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G584
1235 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G584
1236 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G584
1237 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G584
1238 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G584
1239 | | PRT | Oryza sativa | Orthologous to G584
1240 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G584
1241 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G584
1242 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G584
1243 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G590
1244 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G590
1245 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G590
1246 | | PRT | Oryza sativa | Orthologous to G590
1247 | | PRT | Oryza sativa | Orthologous to G590
1248 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G590
1249 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G590
1250 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G592
1251 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G592
1252 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G592
1253 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G592
1254 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G592
1255 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G592
1256 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G592
1257 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G592
1258 | | PRT | Oryza sativa | Orthologous to G592
1259 | | PRT | Oryza sativa | Orthologous to G592
1260 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G592
1261 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G592
1262 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G592
1263 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G592
1264 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G592
1265 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G627
1266 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G627
1267 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G627
1268 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G634
1269 | | PRT | Oryza sativa | Orthologous to G634
1270 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G634
1271 | | DNA | Oryza sativa | Predicted polypeptide sequence is orthologous to G634
1272 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G634
1273 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G634
1274 | | DNA | Zea mays | Predicted polypeptide sequence is orthologous to G634
1275 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G636
1276 | | DNA | Glycine max | Predicted polypeptide sequence is orthologous to G636
1277 DNA Glycine max Predicted polypeptide sequence is orthologous to G636
1278 DNA Glycine max Predicted polypeptide sequence is orthologous to G636
1279 DNA Glycine max Predicted polypeptide sequence is orthologous to G636
1280 DNA Glycine max Predicted polypeptide sequence is orthologous to G636
1281 DNA Glycine max Predicted polypeptide sequence is orthologous to G636
1282 DNA Glycine max Predicted polypeptide sequence is orthologous to G636
1283 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G636
1284 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G636
1285 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G636
1286 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G636
1287 DNA Zea mays Predicted polypeptide sequence is orthologous to G636
1288 DNA Zea mays Predicted polypeptide sequence is orthologous to G636
1289 DNA Zea mays Predicted polypeptide sequence is orthologous to G636
1290 DNA Zea mays Predicted polypeptide sequence is orthologous to G636
1291 DNA Glycine max Predicted polypeptide sequence is orthologous to G638
1292 DNA Glycine max Predicted polypeptide sequence is orthologous to G638
1293 DNA Glycine max Predicted polypeptide sequence is orthologous to G638
1294 DNA Glycine max Predicted polypeptide sequence is orthologous to G638
1295 DNA Glycine max Predicted polypeptide sequence is orthologous to G663
1296 DNA Glycine max Predicted polypeptide sequence is orthologous to G664
1297 DNA Glycine max Predicted polypeptide sequence is orthologous to G664
1298 DNA Glycine max Predicted polypeptide sequence is orthologous to G664
1299 DNA Glycine max Predicted polypeptide sequence is orthologous to G664
1300 DNA Glycine max Predicted polypeptide sequence is orthologous to G664
1301 DNA Glycine max Predicted polypeptide sequence is orthologous to G664
1302 DNA Glycine max Predicted polypeptide sequence is orthologous to G664
1303 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G664
1304 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G664
1305 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G664
1306 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G664
1307 PRT Oryza sativa Orthologous to G664
1308 PRT Oryza sativa Orthologous to G664
1309 PRT Oryza sativa Orthologous to G664
1310 PRT Oryza sativa Orthologous to G664
1311 DNA Zea mays Predicted polypeptide sequence is orthologous to G664
1312 DNA Zea mays Predicted polypeptide sequence is orthologous to G664
1313 DNA Zea mays Predicted polypeptide sequence is orthologous to G664
1314 DNA Zea mays Predicted polypeptide sequence is orthologous to G664
1315 DNA Zea mays Predicted polypeptide sequence is orthologous to G664
1316 DNA Zea mays Predicted polypeptide sequence is orthologous to G664
1317 DNA Zea mays Predicted polypeptide sequence is orthologous to G664
1318 DNA Zea mays Predicted polypeptide sequence is orthologous to G664
1319 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G680
1320 DNA Zea mays Predicted polypeptide sequence is orthologous to G680
1321 DNA Glycine max Predicted polypeptide sequence is orthologous to G736
1322 DNA Glycine max Predicted polypeptide sequence is orthologous to G736
1323 PRT Oryza sativa Orthologous to G736
1324 DNA Glycine max Predicted polypeptide sequence is orthologous to G748
1325 DNA Glycine max Predicted polypeptide sequence is orthologous to G748
1326 DNA Glycine max Predicted polypeptide sequence is orthologous to G748
1327 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G748
1328 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G748
1329 PRT Oryza sativa Orthologous to G748
1330 PRT Oryza sativa Orthologous to G748
1331 PRT Oryza sativa Orthologous to G748
1332 PRT Oryza sativa Orthologous to G748
1333 DNA Zea mays Predicted polypeptide sequence is orthologous to G748
1334 DNA Glycine max Predicted polypeptide sequence is orthologous to G789
1335 DNA Glycine max Predicted polypeptide sequence is orthologous to G789
1336 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G789
1337 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G789
1338 PRT Oryza sativa Orthologous to G789
1339 PRT Oryza sativa Orthologous to G789
1340 PRT Oryza sativa Orthologous to G789
1341 DNA Zea mays Predicted polypeptide sequence is orthologous to G789
1342 DNA Glycine max Predicted polypeptide sequence is orthologous to G801
1343 DNA Glycine max Predicted polypeptide sequence is orthologous to G801
1344 DNA Zea mays Predicted polypeptide sequence is orthologous to G801
1345 DNA Glycine max Predicted polypeptide sequence is orthologous to G849
1346 DNA Glycine max Predicted polypeptide sequence is orthologous to G849
1347 DNA Glycine max Predicted polypeptide sequence is orthologous to G849
1348 DNA Glycine max Predicted polypeptide sequence is orthologous to G849
1349 DNA Glycine max Predicted polypeptide sequence is orthologous to G849
1350 DNA Glycine max Predicted polypeptide sequence is orthologous to G849
1351 DNA Zea mays Predicted polypeptide sequence is orthologous to G849
1352 DNA Zea mays Predicted polypeptide sequence is orthologous to G849
1353 DNA Zea mays Predicted polypeptide sequence is orthologous to G849
1354 DNA Glycine max Predicted polypeptide sequence is orthologous to G864
1355 DNA Glycine max Predicted polypeptide sequence is orthologous to G864
1356 DNA Glycine max Predicted polypeptide sequence is orthologous to G864
1357 DNA Glycine max Predicted polypeptide sequence is orthologous to G864
1358 DNA Glycine max Predicted polypeptide sequence is orthologous to G864
1359 DNA Glycine max Predicted polypeptide sequence is orthologous to G864
1360 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G864
1361 PRT Oryza sativa Orthologous to G864
1362 PRT Oryza sativa Orthologous to G864
1363 DNA Zea mays Predicted polypeptide sequence is orthologous to G864
1364 DNA Zea mays Predicted polypeptide sequence is orthologous to G864
1365 DNA Zea mays Predicted polypeptide sequence is orthologous to G864
1366 DNA Glycine max Predicted polypeptide sequence is orthologous to G867
1367 DNA Glycine max Predicted polypeptide sequence is orthologous to G867
1368 DNA Glycine max Predicted polypeptide sequence is orthologous to G867
1369 DNA Glycine max Predicted polypeptide sequence is orthologous to G867
1370 DNA Glycine max Predicted polypeptide sequence is orthologous to G867
1371 DNA Glycine max Predicted polypeptide sequence is orthologous to G867
1372 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G867
1373 PRT Oryza sativa Orthologous to G867
1374 PRT Oryza sativa Orthologous to G867
1375 PRT Oryza sativa Orthologous to G867
1376 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G867
1377 DNA Zea mays Predicted polypeptide sequence is orthologous to G867
1378 DNA Zea mays Predicted polypeptide sequence is orthologous to G867
1379 DNA Zea mays Predicted polypeptide sequence is orthologous to G867
1380 DNA Zea mays Predicted polypeptide sequence is orthologous to G867
1381 DNA Glycine max Predicted polypeptide sequence is orthologous to G869
1382 DNA Glycine max Predicted polypeptide sequence is orthologous to G869
1383 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G869
1384 PRT Oryza sativa Orthologous to G869
1385 DNA Zea mays Predicted polypeptide sequence is orthologous to G869
1386 DNA Glycine max Predicted polypeptide sequence is orthologous to G877
1387 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G877
1388 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G877
1389 PRT Oryza sativa Orthologous to G877
1390 PRT Oryza sativa Orthologous to G877
1391 PRT Oryza sativa Orthologous to G877
1392 DNA Zea mays Predicted polypeptide sequence is orthologous to G877
1393 DNA Zea mays Predicted polypeptide sequence is orthologous to G877
1394 DNA Zea mays Predicted polypeptide sequence is orthologous to G877
1395 DNA Glycine max Predicted polypeptide sequence is orthologous to G881
1396 PRT Oryza sativa Orthologous to G881
1397 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G881
1398 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G881
1399 DNA Zea mays Predicted polypeptide sequence is orthologous to G881
1400 DNA Zea mays Predicted polypeptide sequence is orthologous to G881
1401 DNA Zea mays Predicted polypeptide sequence is orthologous to G881
1402 DNA Zea mays Predicted polypeptide sequence is orthologous to G881
1403 DNA Glycine max Predicted polypeptide sequence is orthologous to G912
1404 DNA Glycine max Predicted polypeptide sequence is orthologous to G912
1405 DNA Glycine max Predicted polypeptide sequence is orthologous to G912
1406 DNA Glycine max Predicted polypeptide sequence is orthologous to G912
1407 DNA Glycine max Predicted polypeptide sequence is orthologous to G912
1408 DNA Glycine max Predicted polypeptide sequence is orthologous to G912
1409 DNA Glycine max Predicted polypeptide sequence is orthologous to G912
1410 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G912
1411 PRT Oryza sativa Orthologous to G912
1412 PRT Oryza sativa Orthologous to G912
1413 PRT Oryza sativa Orthologous to G912
1414 PRT Oryza sativa Orthologous to G912
1415 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G912
1416 DNA Zea mays Predicted polypeptide sequence is orthologous to G912
1417 DNA Zea mays Predicted polypeptide sequence is orthologous to G912
1418 DNA Zea mays Predicted polypeptide sequence is orthologous to G912
1419 DNA Zea mays Predicted polypeptide sequence is orthologous to G912
1420 DNA Zea mays Predicted polypeptide sequence is orthologous to G912
1421 DNA Glycine max Predicted polypeptide sequence is orthologous to G961
1422 DNA Glycine max Predicted polypeptide sequence is orthologous to G961
1423 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G961
1424 PRT Oryza sativa Orthologous to G961
1425 DNA Zea mays Predicted polypeptide sequence is orthologous to G961
1426 DNA Zea mays Predicted polypeptide sequence is orthologous to G961
1427 DNA Zea mays Predicted polypeptide sequence is orthologous to G961
1428 DNA Glycine max Predicted polypeptide sequence is orthologous to G974
1429 DNA Glycine max Predicted polypeptide sequence is orthologous to G974
1430 DNA Glycine max Predicted polypeptide sequence is orthologous to G974
1431 DNA Glycine max Predicted polypeptide sequence is orthologous to G974
1432 DNA Glycine max Predicted polypeptide sequence is orthologous to G974
1433 DNA Glycine max Predicted polypeptide sequence is orthologous to G974
1434 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G974
1435 PRT Oryza sativa Orthologous to G974
1436 PRT Oryza sativa Orthologous to G974
1437 PRT Oryza sativa Orthologous to G974
1438 DNA Zea mays Predicted polypeptide sequence is orthologous to G974
1439 DNA Zea mays Predicted polypeptide sequence is orthologous to G974
1440 DNA Zea mays Predicted polypeptide sequence is orthologous to G974
1441 DNA Zea mays Predicted polypeptide sequence is orthologous to G974
1442 DNA Glycine max Predicted polypeptide sequence is orthologous to G975
1443 DNA Glycine max Predicted polypeptide sequence is orthologous to G975
1444 DNA Glycine max Predicted polypeptide sequence is orthologous to G975
1445 DNA Glycine max Predicted polypeptide sequence is orthologous to G975
1446 DNA Glycine max Predicted polypeptide sequence is orthologous to G975
1447 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G975
1448 PRT Oryza sativa Orthologous to G975
1449 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G975
1450 DNA Zea mays Predicted polypeptide sequence is orthologous to G975
1451 DNA Zea mays Predicted polypeptide sequence is orthologous to G975
1452 DNA Glycine max Predicted polypeptide sequence is orthologous to G979
1453 DNA Glycine max Predicted polypeptide sequence is orthologous to G979
1454 DNA Glycine max Predicted polypeptide sequence is orthologous to G979
1455 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G979
1456 PRT Oryza sativa Orthologous to G979
1457 PRT Oryza sativa Orthologous to G979
1458 PRT Oryza sativa Orthologous to G979
1459 PRT Oryza sativa Orthologous to G979
1460 PRT Oryza sativa Orthologous to G979
1461 DNA Zea mays Predicted polypeptide sequence is orthologous to G979
1462 DNA Zea mays Predicted polypeptide sequence is orthologous to G979
1463 DNA Zea mays Predicted polypeptide sequence is orthologous to G979
1464 DNA Glycine max Predicted polypeptide sequence is orthologous to G987
1465 DNA Glycine max Predicted polypeptide sequence is orthologous to G987
1466 DNA Glycine max Predicted polypeptide sequence is orthologous to G987
1467 DNA Glycine max Predicted polypeptide sequence is orthologous to G987
1468 DNA Glycine max Predicted polypeptide sequence is orthologous to G987
1469 DNA Glycine max Predicted polypeptide sequence is orthologous to G987
1470 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G987
1471 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G987
1472 PRT Oryza sativa Orthologous to G987
1473 DNA Zea mays Predicted polypeptide sequence is orthologous to G987
1474 DNA Glycine max Predicted polypeptide sequence is orthologous to G1052
1475 DNA Glycine max Predicted polypeptide sequence is orthologous to G1052
1476 DNA Glycine max Predicted polypeptide sequence is orthologous to G1052
1477 DNA Glycine max Predicted polypeptide sequence is orthologous to G1052
1478 DNA Glycine max Predicted polypeptide sequence is orthologous to G1052
1479 DNA Glycine max Predicted polypeptide sequence is orthologous to G1052
1480 DNA Glycine max Predicted polypeptide sequence is orthologous to G1052
1481 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1052
1482 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1052
1483 PRT Oryza sativa Orthologous to G1052
1484 PRT Oryza sativa Orthologous to G1052
1485 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1486 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1487 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1488 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1489 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1490 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1491 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1492 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1493 DNA Zea mays Predicted polypeptide sequence is orthologous to G1052
1494 DNA Glycine max Predicted polypeptide sequence is orthologous to G1062
1495 DNA Glycine max Predicted polypeptide sequence is orthologous to G1062
1496 DNA Glycine max Predicted polypeptide sequence is orthologous to G1062
1497 DNA Glycine max Predicted polypeptide sequence is orthologous to G1062
1498 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1062
1499 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1062
1500 PRT Oryza sativa Orthologous to G1062
1501 DNA Zea mays Predicted polypeptide sequence is orthologous to G1062
1502 DNA Zea mays Predicted polypeptide sequence is orthologous to G1062
1503 DNA Zea mays Predicted polypeptide sequence is orthologous to G1062
1504 DNA Zea mays Predicted polypeptide sequence is orthologous to G1062
1505 DNA Zea mays Predicted polypeptide sequence is orthologous to G1062
1506 DNA Glycine max Predicted polypeptide sequence is orthologous to G1069
1507 DNA Glycine max Predicted polypeptide sequence is orthologous to G1069
1508 PRT Oryza sativa Orthologous to G1069
1509 DNA Zea mays Predicted polypeptide sequence is orthologous to G1069
1510 PRT Oryza sativa Orthologous to G1073
1511 PRT Oryza sativa Orthologous to G1073
1512 DNA Glycine max Predicted polypeptide sequence is orthologous to G1075
1513 DNA Glycine max Predicted polypeptide sequence is orthologous to G1075
1514 DNA Glycine max Predicted polypeptide sequence is orthologous to G1075
1515 DNA Glycine max Predicted polypeptide sequence is orthologous to G1075
1516 DNA Glycine max Predicted polypeptide sequence is orthologous to G1075
1517 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1075
1518 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1075
1519 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1075
1520 PRT Oryza sativa Orthologous to G1089
1521 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1089
1522 DNA Zea mays Predicted polypeptide sequence is orthologous to G1089
1523 DNA Zea mays Predicted polypeptide sequence is orthologous to G1089
1524 DNA Zea mays Predicted polypeptide sequence is orthologous to G1089
1525 DNA Zea mays Predicted polypeptide sequence is orthologous to G1089
1526 DNA Zea mays Predicted polypeptide sequence is orthologous to G1089
1527 DNA Glycine max Predicted polypeptide sequence is orthologous to G1134
1528 DNA Glycine max Predicted polypeptide sequence is orthologous to G1134
1529 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1134
1530 DNA Glycine max Predicted polypeptide sequence is orthologous to G1145
1531 DNA Glycine max Predicted polypeptide sequence is orthologous to G1145
1532 DNA Glycine max Predicted polypeptide sequence is orthologous to G1145
1533 DNA Glycine max Predicted polypeptide sequence is orthologous to G1145
1534 DNA Glycine max Predicted polypeptide sequence is orthologous to G1145
1535 DNA Glycine max Predicted polypeptide sequence is orthologous to G1145
1536 DNA Glycine max Predicted polypeptide sequence is orthologous to G1145
1537 DNA Glycine max Predicted polypeptide sequence is orthologous to G1145
1538 PRT Oryza sativa Orthologous to G1145
1539 PRT Oryza sativa Orthologous to G1145
1540 PRT Oryza sativa Orthologous to G1145
1541 PRT Oryza sativa Orthologous to G1145
1542 PRT Oryza sativa Orthologous to G1145
1543 PRT Oryza sativa Orthologous to G1145
1544 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1145
1545 DNA Zea mays Predicted polypeptide sequence is orthologous to G1145
1546 DNA Zea mays Predicted polypeptide sequence is orthologous to G1145
1547 DNA Zea mays Predicted polypeptide sequence is orthologous to G1145
1548 DNA Zea mays Predicted polypeptide sequence is orthologous to G1145
1549 DNA Zea mays Predicted polypeptide sequence is orthologous to G1145
1550 DNA Glycine max Predicted polypeptide sequence is orthologous to G1198
1551 DNA Glycine max Predicted polypeptide sequence is orthologous to G1198
1552 DNA Glycine max Predicted polypeptide sequence is orthologous to G1198
1553 DNA Glycine max Predicted polypeptide sequence is orthologous to G1198
1554 DNA Glycine max Predicted polypeptide sequence is orthologous to G1198
1555 DNA Glycine max Predicted polypeptide sequence is orthologous to G1198
1556 DNA Glycine max Predicted polypeptide sequence is orthologous to G1198
1557 DNA Glycine max Predicted polypeptide sequence is orthologous to G1198
1558 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1198
1559 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1198
1560 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1198
1561 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1198
1562 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1198
1563 PRT Oryza sativa Orthologous to G1198
1564 PRT Oryza sativa Orthologous to G1198
1565 PRT Oryza sativa Orthologous to G1198
1566 PRT Oryza sativa Orthologous to G1198
1567 PRT Oryza sativa Orthologous to G1198
1568 PRT Oryza sativa Orthologous to G1198
1569 PRT Oryza sativa Orthologous to G1198
1570 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1571 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1572 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1573 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1574 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1575 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1576 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1577 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1578 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1579 DNA Zea mays Predicted polypeptide sequence is orthologous to G1198
1580 DNA Glycine max Predicted polypeptide sequence is orthologous to G1242
1581 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1242
1582 PRT Oryza sativa Orthologous to G1242
1583 PRT Oryza sativa Orthologous to G1242
1584 DNA Zea mays Predicted polypeptide sequence is orthologous to G1242
1585 DNA Zea mays Predicted polypeptide sequence is orthologous to G1242
1586 DNA Glycine max Predicted polypeptide sequence is orthologous to G1255
1587 DNA Glycine max Predicted polypeptide sequence is orthologous to G1255
1588 DNA Glycine max Predicted polypeptide sequence is orthologous to G1255
1589 DNA Glycine max Predicted polypeptide sequence is orthologous to G1255
1590 DNA Glycine max Predicted polypeptide sequence is orthologous to G1255
1591 DNA Glycine max Predicted polypeptide sequence is orthologous to G1255
1592 DNA Glycine max Predicted polypeptide sequence is orthologous to G1255
1593 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1255
1594 PRT Oryza sativa Orthologous to G1255
1595 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1255
1596 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1255
1597 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1255
1598 DNA Zea mays Predicted polypeptide sequence is orthologous to G1255
1599 DNA Zea mays Predicted polypeptide sequence is orthologous to G1255
1600 DNA Zea mays Predicted polypeptide sequence is orthologous to G1255
1601 DNA Zea mays Predicted polypeptide sequence is orthologous to G1255
1602 DNA Zea mays Predicted polypeptide sequence is orthologous to G1255
1603 DNA Zea mays Predicted polypeptide sequence is orthologous to G1255
1604 DNA Glycine max Predicted polypeptide sequence is orthologous to G1266
1605 DNA Glycine max Predicted polypeptide sequence is orthologous to G1266
1606 DNA Glycine max Predicted polypeptide sequence is orthologous to G1266
1607 DNA Glycine max Predicted polypeptide sequence is orthologous to G1266
1608 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1266
1609 DNA Glycine max Predicted polypeptide sequence is orthologous to G1274
1610 DNA Glycine max Predicted polypeptide sequence is orthologous to G1274
1611 PRT Oryza sativa Orthologous to G1274
1612 PRT Oryza sativa Orthologous to G1274
1613 DNA Zea mays Predicted polypeptide sequence is orthologous to G1274
1614 DNA Zea mays Predicted polypeptide sequence is orthologous to G1274
1615 DNA Zea mays Predicted polypeptide sequence is orthologous to G1274
1616 DNA Zea mays Predicted polypeptide sequence is orthologous to G1274
1617 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1275
1618 PRT Oryza sativa Orthologous to G1275
1619 PRT Oryza sativa Orthologous to G1275
1620 PRT Oryza sativa Orthologous to G1275
1621 DNA Zea mays Predicted polypeptide sequence is orthologous to G1275
1622 DNA Zea mays Predicted polypeptide sequence is orthologous to G1275
1623 DNA Zea mays Predicted polypeptide sequence is orthologous to G1275
1624 DNA Glycine max Predicted polypeptide sequence is orthologous to G1313
1625 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1313
1626 PRT Oryza sativa Orthologous to G1313
1627 PRT Oryza sativa Orthologous to G1313
1628 DNA Zea mays Predicted polypeptide sequence is orthologous to G1313
1629 DNA Zea mays Predicted polypeptide sequence is orthologous to G1313
1630 DNA Zea mays Predicted polypeptide sequence is orthologous to G1313
1631 DNA Glycine max Predicted polypeptide sequence is orthologous to G1322
1632 DNA Glycine max Predicted polypeptide sequence is orthologous to G1322
1633 DNA Glycine max Predicted polypeptide sequence is orthologous to G1322
1634 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1322
1635 PRT Oryza sativa Orthologous to G1322
1636 PRT Oryza sativa Orthologous to G1322
1637 DNA Zea mays Predicted polypeptide sequence is orthologous to G1323
1638 DNA Zea mays Predicted polypeptide sequence is orthologous to G1323
1639 DNA Glycine max Predicted polypeptide sequence is orthologous to G1417
1640 PRT Oryza sativa Orthologous to G1417
1641 PRT Oryza sativa Orthologous to G1417
1642 DNA Glycine max Predicted polypeptide sequence is orthologous to G1449
1643 DNA Glycine max Predicted polypeptide sequence is orthologous to G1449
1644 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1449
1645 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1449
1646 DNA Zea mays Predicted polypeptide sequence is orthologous to G1449
1647 DNA Zea mays Predicted polypeptide sequence is orthologous to G1449
1648 DNA Zea mays Predicted polypeptide sequence is orthologous to G1449
1649 DNA Zea mays Predicted polypeptide sequence is orthologous to G1449
1650 DNA Glycine max Predicted polypeptide sequence is orthologous to G1451
1651 DNA Glycine max Predicted polypeptide sequence is orthologous to G1451
1652 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1451
1653 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1451
1654 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1451
1655 PRT Oryza sativa Orthologous to G1451
1656 PRT Oryza sativa Orthologous to G1451
1657 PRT Oryza sativa Orthologous to G1451
1658 PRT Oryza sativa Orthologous to G1451
1659 DNA Zea mays Predicted polypeptide sequence is orthologous to G1451
1660 DNA Zea mays Predicted polypeptide sequence is orthologous to G1451
1661 DNA Zea mays Predicted polypeptide sequence is orthologous to G1451
1662 DNA Zea mays Predicted polypeptide sequence is orthologous to G1451
1663 DNA Glycine max Predicted polypeptide sequence is orthologous to G1482
1664 DNA Glycine max Predicted polypeptide sequence is orthologous to G1482
1665 DNA Glycine max Predicted polypeptide sequence is orthologous to G1482
1666 DNA Glycine max Predicted polypeptide sequence is orthologous to G1482
1667 DNA Glycine max Predicted polypeptide sequence is orthologous to G1482
1668 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1482
1669 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1482
1670 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1482
1671 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1482
1672 PRT Oryza sativa Orthologous to G1482
1673 PRT Oryza sativa Orthologous to G1482
1674 DNA Zea mays Predicted polypeptide sequence is orthologous to G1482
1675 DNA Zea mays Predicted polypeptide sequence is orthologous to G1482
1676 DNA Zea mays Predicted polypeptide sequence is orthologous to G1482
1677 DNA Zea mays Predicted polypeptide sequence is orthologous to G1482
1678 DNA Zea mays Predicted polypeptide sequence is orthologous to G1482
1679 DNA Zea mays Predicted polypeptide sequence is orthologous to G1482
1680 PRT Oryza sativa Orthologous to G1499
1681 DNA Glycine max Predicted polypeptide sequence is orthologous to G1540
1682 PRT Oryza sativa Orthologous to G1540
1683 DNA Glycine max Predicted polypeptide sequence is orthologous to G1560
1684 DNA Glycine max Predicted polypeptide sequence is orthologous to G1560
1685 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1560
1686 PRT Oryza sativa Orthologous to G1560
1687 PRT Oryza sativa Orthologous to G1560
1688 PRT Oryza sativa Orthologous to G1560
1689 PRT Oryza sativa Orthologous to G1560
1690 DNA Zea mays Predicted polypeptide sequence is orthologous to G1560
1691 DNA Zea mays Predicted polypeptide sequence is orthologous to G1560
1692 DNA Zea mays Predicted polypeptide sequence is orthologous to G1560
1693 DNA Zea mays Predicted polypeptide sequence is orthologous to G1560
1694 DNA Zea mays Predicted polypeptide sequence is orthologous to G1560
1695 DNA Zea mays Predicted polypeptide sequence is orthologous to G1560
1696 PRT Oryza sativa Orthologous to G1645
1697 DNA Zea mays Predicted polypeptide sequence is orthologous to G1645
1698 DNA Zea mays Predicted polypeptide sequence is orthologous to G1645
1699 DNA Zea mays Predicted polypeptide sequence is orthologous to G1645
1700 DNA Glycine max Predicted polypeptide sequence is orthologous to G1760
1701 DNA Glycine max Predicted polypeptide sequence is orthologous to G1760
1702 PRT Oryza sativa Orthologous to G1760
1703 PRT Oryza sativa Orthologous to G1760
1704 DNA Zea mays Predicted polypeptide sequence is orthologous to G1760
1705 DNA Zea mays Predicted polypeptide sequence is orthologous to G1760
1706 DNA Zea mays Predicted polypeptide sequence is orthologous to G1760
1707 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G1816
1708 DNA Glycine max Predicted polypeptide sequence is orthologous to G2010, G2347
1709 DNA Oryza sativa Predicted polypeptide sequence is orthologous to G2010, G2347
1710 DNA Zea mays Predicted polypeptide sequence is orthologous to G2010
1711 DNA Zea mays Predicted polypeptide sequence is orthologous to G2010, G2347
1712 G5 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G974
1713 G5 PRT Arabidopsis thaliana Paralogous to G974
1714 G9 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G867
1715 G9 PRT Arabidopsis thaliana Paralogous to G867
1716 G12 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G24
1717 G12 PRT Arabidopsis thaliana Paralogous to G24
1718 G29 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G46
1719 G29 PRT Arabidopsis thaliana Paralogous to G46
1720 G40 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G912
1721 G40 PRT Arabidopsis thaliana Paralogous to G912
1722 G41 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G912
1723 G41 PRT Arabidopsis thaliana Paralogous to G912
1724 G42 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G912
1725 G42 PRT Arabidopsis thaliana Paralogous to G912
1726 G43 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G46
1727 G43 PRT Arabidopsis thaliana Paralogous to G46
1728 G149 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G627
1729 G149 PRT Arabidopsis thaliana Paralogous to G627
1730 G152 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G1760
1731 G152 PRT Arabidopsis thaliana Paralogous to G1760
1732 G153 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G1760
1733 G153 PRT Arabidopsis thaliana Paralogous to G1760
1734 G157 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G859, G1842, G1843
1735 G157 PRT Arabidopsis thaliana Paralogous to G859, G1842, G1843
1736 G175 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G877
1737 G175 PRT Arabidopsis thaliana Paralogous to G877
1738 G182 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G196
1739 G182 PRT Arabidopsis thaliana Paralogous to G196
1740 G197 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G664
1741 G197 PRT Arabidopsis thaliana Paralogous to G664
1742 G212 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G676
1743 G212 PRT Arabidopsis thaliana Paralogous to G676
1744 G214 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G680
1745 G214 PRT Arabidopsis thaliana Paralogous to G680
1746 G221 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G1322
1747 G221 PRT Arabidopsis thaliana Paralogous to G1322
1748 G225 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G226, G682, G1816, G2718
1749 G225 PRT Arabidopsis thaliana Paralogous to G226, G682, G1816, G2718
1750 G226 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G682, G1816, G2718
1751 G226 PRT Arabidopsis thaliana Paralogous to G682, G1816, G2718
1752 G228 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G254
1753 G228 PRT Arabidopsis thaliana Paralogous to G254
1754 G233 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G241
1755 G233 PRT Arabidopsis thaliana Paralogous to G241
1756 G247 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G676
1757 G247 PRT Arabidopsis thaliana Paralogous to G676
1758 G249 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G1322
1759 G249 PRT Arabidopsis thaliana Paralogous to G1322
1760 G255 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G664
1761 G255 PRT Arabidopsis thaliana Paralogous to G664
1762 G350 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G545
1763 G350 PRT Arabidopsis thaliana Paralogous to G545
1764 G351 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G545
1765 G351 PRT Arabidopsis thaliana Paralogous to G545
1766 G362 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G361
1767 G362 PRT Arabidopsis thaliana Paralogous to G361
1768 G370 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G361
1769 G370 PRT Arabidopsis thaliana Paralogous to G361
1770 G390 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G438
1771 G390 PRT Arabidopsis thaliana Paralogous to G438
1772 G391 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G390, G438
1773 G391 PRT Arabidopsis thaliana Paralogous to G390, G438
1774 G392 DNA Arabidopsis thaliana Predicted polypeptide sequence is paralogous to G390, G438
1775 G392 PRT Arabidopsis thaliana Paralogous to G390, G438
1776 G438 DNA Arabidopsis thaliana Predicted polypeptide sequence is
paralogous to G390 1777 G438 PRT Arabidopsis Paralogous toG390 thaliana 1778 G440 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G864 1779 G440 PRT Arabidopsis Paralogous toG864 thaliana 1780 G463 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G464 1781 G463 PRT Arabidopsis Paralogous toG464 thaliana 1782 G481 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G482 1783 G481 PRT Arabidopsis Paralogous toG482 thaliana 1784 G485 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G482 1785 G485 PRT Arabidopsis Paralogous toG482 thaliana 1786 G554 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1198 1787 G554 PRT Arabidopsis Paralogous toG1198 thaliana 1788 G555 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1198 1789 G555 PRT Arabidopsis Paralogous toG1198 thaliana 1790 G556 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1198 1791 G556 PRT Arabidopsis Paralogous toG1198 thaliana 1792 G558 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1198 1793 G558 PRT Arabidopsis Paralogous toG1198 thaliana 1794 G578 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1198 1795 G578 PRT Arabidopsis Paralogous toG1198 thaliana 1796 G610 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G849 1797 G610 PRT Arabidopsis Paralogous toG849 thaliana 1798 G629 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1198 1799 G629 PRT Arabidopsis Paralogous toG1198 thaliana 1800 G659 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1323 1801 G659 PRT Arabidopsis Paralogous toG1323 thaliana 1802 G666 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G256 1803 G666 PRT Arabidopsis Paralogous toG256 thaliana 1804 G668 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G256 1805 
G668 PRT Arabidopsis Paralogous toG256 thaliana 1806 G680 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G214 1807 G680 PRT Arabidopsis Paralogous toG214 thaliana 1808 G682 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G226, G1816, G2718 1809 G682 PRT ArabidopsisParalogous to G226, G1816, G2718 thaliana 1810 G714 DNA ArabidopsisPredicted polypeptide thaliana sequence is paralogous to G489 1811 G714PRT Arabidopsis Paralogous to G489 thaliana 1812 G859 DNA ArabidopsisPredicted polypeptide thaliana sequence is paralogous to G157, G1842,G1843 1813 G859 PRT Arabidopsis Paralogous to G157, G1842, G1843thaliana 1814 G860 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1760 1815 G860 PRT Arabidopsis Paralogous toG1760 thaliana 1816 G913 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G912 1817 G913 PRT Arabidopsis Paralogous toG912 thaliana 1818 G914 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G971 1819 G914 PRT Arabidopsis Paralogous toG971 thaliana 1820 G932 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G256 1821 G932 PRT Arabidopsis Paralogous toG256 thaliana 1822 G957 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G961 1823 G957 PRT Arabidopsis Paralogous toG961 thaliana 1824 G986 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G881 1825 G986 PRT Arabidopsis Paralogous toG881 thaliana 1826 G990 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1451 1827 G990 PRT Arabidopsis Paralogous toG1451 thaliana 1828 G993 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G867 1829 G993 PRT Arabidopsis Paralogous toG867 thaliana 1830 G1004 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G46 1831 G1004 PRT Arabidopsis Paralogous toG46 thaliana 1832 G1006 DNA Arabidopsis Predicted polypeptide thalianasequence is 
paralogous to G28 1833 G1006 PRT Arabidopsis Paralogous toG28 thaliana 1834 G1051 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1052 1835 G1051 PRT Arabidopsis Paralogous toG1052 thaliana 1836 G1056 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1145 1837 G1056 PRT Arabidopsis Paralogous toG1145 thaliana 1838 G1067 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1073 1839 G1067 PRT Arabidopsis Paralogous toG1073 thaliana 1840 G1076 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1075 1841 G1076 PRT Arabidopsis Paralogous toG1075 thaliana 1842 G1136 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G584 1843 G1136 PRT Arabidopsis Paralogous toG584 thaliana 1844 G1211 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G291 1845 G1211 PRT Arabidopsis Paralogous toG291 thaliana 1846 G1243 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1242 1847 G1243 PRT Arabidopsis Paralogous toG1242 thaliana 1848 G1277 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G24 1849 G1277 PRT Arabidopsis Paralogous toG24 thaliana 1850 G1325 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1313 1851 G1325 PRT Arabidopsis Paralogous toG1313 thaliana 1852 G1329 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G663 1853 G1329 PRT Arabidopsis Paralogous toG663 thaliana 1854 G1349 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G896 1855 G1349 PRT Arabidopsis Paralogous toG896 thaliana 1856 G1364 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G482 1857 G1364 PRT Arabidopsis Paralogous toG482 thaliana 1858 G1379 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G24 1859 G1379 PRT Arabidopsis Paralogous toG24 thaliana 1860 G1387 DNA Arabidopsis Predicted polypeptide thalianasequence is 
paralogous to G975 1861 G1387 PRT Arabidopsis Paralogous toG975 thaliana 1862 G1419 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G46 1863 G1419 PRT Arabidopsis Paralogous toG46 thaliana 1864 G1484 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G1255 1865 G1484 PRT Arabidopsis Paralogous toG1255 thaliana 1866 G1494 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G789 1867 G1494 PRT Arabidopsis Paralogous toG789 thaliana 1868 G1548 DNA Arabidopsis Predicted polypeptide thalianasequence is paralogous to G390, G438 1869 G1548 PRT ArabidopsisParalogous to G390, G438 thaliana 1870 G1664 DNA Arabidopsis Predictedpolypeptide thaliana sequence is paralogous to G1062 1871 G1664 PRTArabidopsis Paralogous to G1062 thaliana 1872 G1750 DNA ArabidopsisPredicted polypeptide thaliana sequence is paralogous to G864 1873 G1750PRT Arabidopsis Paralogous to G864 thaliana 1874 G1759 DNA ArabidopsisPredicted polypeptide thaliana sequence is paralogous to G157, G859,G1842, G1843 1875 G1759 PRT Arabidopsis Paralogous to G157, thalianaG859, G1842, G1843 1876 G1785 DNA Arabidopsis Predicted polypeptidethaliana sequence is paralogous to G248 1877 G1785 PRT ArabidopsisParalogous to G248 thaliana 1878 G1806 DNA Arabidopsis Predictedpolypeptide thaliana sequence is paralogous to G1198 1879 G1806 PRTArabidopsis Paralogous to G1198 thaliana 1880 G1816 DNA ArabidopsisPredicted polypeptide thaliana sequence is paralogous to G226, G682,G2718 1881 G1816 PRT Arabidopsis Paralogous to G226, thaliana G682,G2718 1882 G1842 DNA Arabidopsis Predicted polypeptide thaliana sequenceis paralogous to G157, G859, G1843 1883 G1842 PRT Arabidopsis Paralogousto G157, thaliana G859, G1843 1884 G1843 DNA Arabidopsis Predictedpolypeptide thaliana sequence is paralogous to G157, G859, G1842 1885G1843 PRT Arabidopsis Paralogous to G157, thaliana G859, G1842 1886G1844 DNA Arabidopsis Predicted polypeptide thaliana sequence isparalogous to G157, 
G859, G1842, G1843 1887 G1844 PRT ArabidopsisParalogous to G157, thaliana G859, G1842, G1843 1888 G1887 DNAArabidopsis Predicted polypeptide thaliana sequence is paralogous toG896 1889 G1887 PRT Arabidopsis Paralogous to G896 thaliana 1890 G1888DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG1482 1891 G1888 PRT Arabidopsis Paralogous to G1482 thaliana 1892 G1930DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG867 1893 G1930 PRT Arabidopsis Paralogous to G867 thaliana 1894 G1995DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG361 1895 G1995 PRT Arabidopsis Paralogous to G361 thaliana 1896 G1998DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG325 1897 G1998 PRT Arabidopsis Paralogous to G325 thaliana 1898 G2010DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG2347 1899 G2010 PRT Arabidopsis Paralogous to G2347 thaliana 1900 G2106DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG979 1901 G2106 PRT Arabidopsis Paralogous to G979 thaliana 1902 G2107DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG912 1903 G2107 PRT Arabidopsis Paralogous to G912 thaliana 1904 G2131DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG979 1905 G2131 PRT Arabidopsis Paralogous to G979 thaliana 1906 G2153DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG1069 1907 G2153 PRT Arabidopsis Paralogous to G1069 thaliana 1908 G2156DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG1073 1909 G2156 PRT Arabidopsis Paralogous to G1073 thaliana 1910 G2345DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG482 1911 G2345 PRT Arabidopsis Paralogous to G482 thaliana 1912 G2347DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG2010 1913 G2347 PRT Arabidopsis Paralogous to G2010 thaliana 1914 G2421DNA Arabidopsis Predicted polypeptide 
thaliana sequence is paralogous toG663 1915 G2421 PRT Arabidopsis Paralogous to G663 thaliana 1916 G2422DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG663 1917 G2422 PRT Arabidopsis Paralogous to G663 thaliana 1918 G2424DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG1645 1919 G2424 PRT Arabidopsis Paralogous to G1645 thaliana 1920 G2432DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG736 1921 G2432 PRT Arabidopsis Paralogous to G736 thaliana 1922 G2513DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG912 1923 G2513 PRT Arabidopsis Paralogous to G912 thaliana 1924 G2535DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG961 1925 G2535 PRT Arabidopsis Paralogous to G961 thaliana 1926 G2555DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG1134 1927 G2555 PRT Arabidopsis Paralogous to G1134 thaliana 1928 G2583DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG975 1929 G2583 PRT Arabidopsis Paralogous to G975 thaliana 1930 G2701DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG1634 1931 G2701 PRT Arabidopsis Paralogous to G1634 thaliana 1932 G2718DNA Arabidopsis Predicted polypeptide thaliana sequence is paralogous toG226, G682, G1816 1933 G2718 PRT Arabidopsis Paralogous to G226,thaliana G682, G1816 1934 G2826 DNA Arabidopsis Predicted polypeptidethaliana sequence is paralogous to G361 1935 G2826 PRT ArabidopsisParalogous to G361 thaliana 1936 G2838 DNA Arabidopsis Predictedpolypeptide thaliana sequence is paralogous to G361 1937 G2838 PRTArabidopsis Paralogous to G361 thaliana 1938 G3010 DNA ArabidopsisPredicted polypeptide thaliana sequence is paralogous to G987 1939 G3010PRT Arabidopsis Paralogous to G987 thaliana 1940 bnCBF1 DNA BrassicaPredicted polypeptide napus sequence is orthologous to G40, G41, G42,G912, G2107, G2513 1941 bnCBF1 PRT Brassica Orthologous to G40, napusG41, 
G42, G912, G2107, G2513 1971 Soy MADS 1 PRT Glycine max Predicted polypeptide sequence is orthologous to G157, G859, G1842, G1843 1973 Soy MADS 3 PRT Glycine max Predicted polypeptide sequence is orthologous to G157, G859, G1842, G1843

Molecular Modeling

Another means that may be used to confirm the utility and function of transcription factor sequences that are orthologous or paralogous to presently disclosed transcription factors is through the use of molecular modeling software. Molecular modeling is routinely used to predict polypeptide structure, and a variety of protein structure modeling programs, such as “Insight II” (Accelrys, Inc.), are commercially available for this purpose. Modeling can thus be used to predict which residues of a polypeptide can be changed without altering function (Crameri et al. (2003) U.S. Pat. No. 6,521,453). Thus, polypeptides that are sequentially similar can be shown to have a high likelihood of similar function by their structural similarity, which may, for example, be established by comparison of regions of superstructure. The relative tendencies of amino acids to form regions of superstructure (for example, helixes and β-sheets) are well established. For example, O'Neil et al. ((1990) Science 250: 646-651) have discussed in detail the helix forming tendencies of amino acids. Tables of relative structure forming activity for amino acids can be used as substitution tables to predict which residues can be functionally substituted in a given region, for example, in DNA-binding domains of known transcription factors and equivalogs. Homologs that are likely to be functionally similar can then be identified.

Of particular interest is the structure of a transcription factor in the region of its conserved domain, such as those identified in Table 5. Structural analyses may be performed by comparing the structure of the known transcription factor around its conserved domain with those of orthologs and paralogs. Analysis of a number of polypeptides within a transcription factor group or clade, including the functionally or sequentially similar polypeptides provided in the Sequence Listing, may also provide an understanding of structural elements required to regulate transcription within a given family.

EXAMPLES

The invention, now being generally described, will be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention and are not intended to limit the invention. It will be recognized by one of skill in the art that a transcription factor that is associated with a particular first trait may also be associated with at least one other, unrelated and inherent second trait which was not predicted by the first trait.

The complete descriptions of the traits associated with each polynucleotide of the invention are fully disclosed in Table 4 and Table 6. The complete description of the transcription factor gene family and identified conserved domains of the polypeptide encoded by the polynucleotide is fully disclosed in Tables 5A and 5B.

Example I: Full Length Gene Identification and Cloning

Putative transcription factor sequences (genomic or ESTs) related to known transcription factors were identified in the Arabidopsis thaliana GenBank database using the tblastn sequence analysis program with default parameters and a P-value cutoff threshold of −4 or −5 or lower, depending on the length of the query sequence. Putative transcription factor sequence hits were then screened to identify those containing particular sequence strings. If the sequence hits contained such sequence strings, the sequences were confirmed as transcription factors.
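The two-stage screen described above (a P-value cutoff followed by a check for particular sequence strings) can be sketched as follows. This is only an illustration under stated assumptions: the cutoff is read as the base-10 exponent of the P-value, and the `Hit` record, the 100-residue boundary used to choose between −4 and −5, and the example string `WTAEED` are all hypothetical, since the text gives neither the length rule nor the sequence strings.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    """One tblastn hit (hypothetical record layout, not a BLAST output format)."""
    subject_id: str
    p_value: float        # P-value reported for the hit
    translated_seq: str   # translated subject sequence over the aligned region


def exponent_cutoff(query_length: int, boundary: int = 100) -> int:
    """Return -4 for short queries and -5 for longer ones.

    The 100-residue boundary is an assumption made for illustration only.
    """
    return -4 if query_length < boundary else -5


def screen_hits(hits, query_length, required_strings):
    """Keep hits that pass the P-value cutoff and contain a required string."""
    cutoff = exponent_cutoff(query_length)
    confirmed = []
    for hit in hits:
        # A reported P-value of 0 passes any cutoff.
        exponent = math.log10(hit.p_value) if hit.p_value > 0 else -math.inf
        has_string = any(s in hit.translated_seq for s in required_strings)
        if exponent <= cutoff and has_string:
            confirmed.append(hit.subject_id)
    return confirmed


# Toy hits; the sequences are invented and carry no biological meaning.
hits = [
    Hit("hitA", 1e-12, "MKRGRWTAEEDE"),  # significant and contains the string
    Hit("hitB", 1e-3, "MKRGRWTAEEDE"),   # contains the string but too weak
    Hit("hitC", 1e-20, "MSSSPLLLAAAA"),  # significant but lacks the string
]
confirmed = screen_hits(hits, query_length=250, required_strings=["WTAEED"])
```

Only hits that satisfy both conditions are retained, which mirrors the order of operations in the paragraph above.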

Alternatively, Arabidopsis thaliana cDNA libraries derived from different tissues or treatments, or genomic libraries, were screened to identify novel members of a transcription factor family using a low stringency hybridization approach. Probes were synthesized using gene specific primers in a standard PCR reaction (annealing temperature 60° C.) and labeled with ³²P dCTP using the High Prime DNA Labeling Kit (Boehringer Mannheim Corp. (now Roche Diagnostics Corp., Indianapolis, Ind.)). Purified radiolabelled probes were added to filters immersed in Church hybridization medium (0.5 M NaPO₄ pH 7.0, 7% SDS, 1% w/v bovine serum albumin) and hybridized overnight at 60° C. with shaking. Filters were washed two times for 45 to 60 minutes with 1×SSC, 1% SDS at 60° C.

To identify additional sequence 5′ or 3′ of a partial cDNA sequence in a cDNA library, 5′ and 3′ rapid amplification of cDNA ends (RACE) was performed using the MARATHON cDNA amplification kit (Clontech, Palo Alto, Calif.). Generally, the method entailed first isolating poly(A) mRNA, performing first and second strand cDNA synthesis to generate double stranded cDNA, blunting cDNA ends, followed by ligation of the MARATHON Adaptor to the cDNA to form a library of adaptor-ligated ds cDNA.

Gene-specific primers were designed to be used along with adaptor specific primers for both 5′ and 3′ RACE reactions. Nested primers, rather than single primers, were used to increase PCR specificity. Using 5′ and 3′ RACE reactions, 5′ and 3′ RACE fragments were obtained, sequenced and cloned. The process can be repeated until the 5′ and 3′ ends of the full-length gene are identified. The full-length cDNA was then generated by end-to-end PCR using primers specific to the 5′ and 3′ ends of the gene.
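The way sequenced RACE fragments extend a partial cDNA in both directions can be illustrated with a short overlap-merging sketch. It is not the cloning procedure itself, which relied on PCR, sequencing and cloning; the function name, the minimum overlap of 8 bases, and the toy sequences are hypothetical and chosen only to show how the 5′ and 3′ ends are located.

```python
def merge_by_overlap(left: str, right: str, min_overlap: int = 8) -> str:
    """Join two sequences where the end of `left` matches the start of `right`.

    The longest exact overlap of at least `min_overlap` bases is used.
    """
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("no overlap of the required length was found")


# Invented toy sequences (not from the Sequence Listing).
five_prime_race = "ATGGCTTCAGGAACC"      # 5' RACE fragment, extends upstream
partial_cdna = "TCAGGAACCGTTAAGCTTGG"    # partial cDNA used to design primers
three_prime_race = "GTTAAGCTTGGCATAA"    # 3' RACE fragment, extends downstream

# Extend upstream first, then downstream.
extended = merge_by_overlap(five_prime_race, partial_cdna)
full_length = merge_by_overlap(extended, three_prime_race)
```

Once the assembled sequence is known, primers matching its two ends correspond to the end-to-end PCR primers mentioned above.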

Example II Construction of Expression Vectors

The sequence was amplified from a genomic or cDNA library using primers specific to sequences upstream and downstream of the coding region. The expression vector was pMEN20 or pMEN65, which are both derived from pMON316 (Sanders et al. (1987) Nucleic Acids Res. 15:1543-1558) and contain the CaMV 35S promoter to express transgenes. To clone the sequence into the vector, both pMEN20 and the amplified DNA fragment were digested separately with SalI and NotI restriction enzymes at 37° C. for 2 hours. The digestion products were subjected to electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide staining. The DNA fragments containing the sequence and the linearized plasmid were excised and purified by using a QIAQUICK gel extraction kit (Qiagen, Valencia, Calif.). The fragments of interest were ligated at a ratio of 3:1 (vector to insert). Ligation reactions using T4 DNA ligase (New England Biolabs, Beverly, Mass.) were carried out at 16° C. for 16 hours. The ligated DNAs were transformed into competent cells of the E. coli strain DH5alpha by using the heat shock method. The transformations were plated on LB plates containing 50 mg/l kanamycin (Sigma Chemical Co., St. Louis, Mo.). Individual colonies were grown overnight in five milliliters of LB broth containing 50 mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini Prep kits (Qiagen, Valencia, Calif.).
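As a worked example of the ligation ratio, the sketch below converts a 3:1 vector-to-insert ratio into a mass of insert, assuming the ratio is a molar ratio. The vector mass and the fragment lengths (90 ng, 12,000 bp, 1,200 bp) are hypothetical, since the text does not state them.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   vector_to_insert: float = 3.0) -> float:
    """Mass of insert that gives the requested vector:insert molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) / vector_to_insert.
    """
    if min(vector_ng, vector_bp, insert_bp, vector_to_insert) <= 0:
        raise ValueError("all quantities must be positive")
    return vector_ng * (insert_bp / vector_bp) / vector_to_insert


# Hypothetical amounts: 90 ng of a 12 kb linearized vector, 1.2 kb insert.
mass = insert_mass_ng(vector_ng=90.0, vector_bp=12000, insert_bp=1200)
```

With these illustrative numbers the calculation gives about 3 ng of insert; other fragment sizes scale the result linearly.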

Example III Transformation of Agrobacterium with the Expression Vector

After the plasmid vector containing the gene was constructed, the vector was used to transform Agrobacterium tumefaciens cells expressing the gene products. The stock of Agrobacterium tumefaciens cells for transformation was made as described by Nagel et al. (1990) FEMS Microbiol Letts. 67: 325-328. Agrobacterium strain ABI was grown in 250 ml LB medium (Sigma) overnight at 28° C. with shaking until an absorbance over 1 cm at 600 nm (A₆₀₀) of 0.5-1.0 was reached. Cells were harvested by centrifugation at 4,000×g for 15 min at 4° C. Cells were then resuspended in 250 μl chilled buffer (1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were centrifuged again as described above and resuspended in 125 μl chilled buffer. Cells were then centrifuged and resuspended two more times in the same HEPES buffer as described above at a volume of 100 μl and 750 μl, respectively. Resuspended cells were then distributed into 40 μl aliquots, quickly frozen in liquid nitrogen, and stored at −80° C.

Agrobacterium cells were transformed with plasmids prepared as described above following the protocol described by Nagel et al. (supra). For each DNA construct to be transformed, 50-100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 8.0) was mixed with 40 μl of Agrobacterium cells. The DNA/cell mixture was then transferred to a chilled cuvette with a 2 mm electrode gap and subjected to a 2.5 kV charge dissipated at 25 μF and 200 μF using a Gene Pulser II apparatus (Bio-Rad, Hercules, Calif.). After electroporation, cells were immediately resuspended in 1.0 ml LB and allowed to recover without antibiotic selection for 2-4 hours at 28° C. in a shaking incubator. After recovery, cells were plated onto selective medium of LB broth containing 100 μg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 28° C. Single colonies were then picked and inoculated in fresh medium. The presence of the plasmid construct was verified by PCR amplification and sequence analysis.

Example IV Transformation of Arabidopsis Plants with Agrobacterium tumefaciens with Expression Vector

After transformation of Agrobacterium tumefaciens with plasmid vectors containing the gene, single Agrobacterium colonies were identified, propagated, and used to transform Arabidopsis plants. Briefly, 500 ml cultures of LB medium containing 50 mg/l kanamycin were inoculated with the colonies and grown at 28° C. with shaking for 2 days until an optical absorbance at 600 nm wavelength over 1 cm (A₆₀₀) of >2.0 was reached. Cells were then harvested by centrifugation at 4,000×g for 10 min, and resuspended in infiltration medium (1/2 X Murashige and Skoog salts (Sigma), 1×Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 μM benzylamino purine (Sigma), 200 μl/l Silwet L77 (Lehle Seeds)) until an A₆₀₀ of 0.8 was reached.
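The concentration-based components of the infiltration medium scale linearly with batch volume, which the following sketch makes explicit. It assumes the benzylamino purine concentration is read as 0.044 μM, and it leaves out the Murashige and Skoog salts and Gamborg's vitamins because their amounts depend on the supplier's stock formulation. The function name and the 0.5 l batch size are hypothetical.

```python
def infiltration_medium(volume_l: float) -> dict:
    """Amounts of the concentration-defined components for a given volume.

    Concentrations used:
      sucrose        5.0% (w/v)  = 50 g per litre
      Silwet L77     200 ul per litre
      benzylamino purine, assumed 0.044 uM = 44 nmol per litre
    """
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {
        "sucrose_g": 50.0 * volume_l,
        "silwet_l77_ul": 200.0 * volume_l,
        "benzylamino_purine_nmol": 44.0 * volume_l,
    }


# Hypothetical batch size of half a litre.
recipe = infiltration_medium(0.5)
```

For the 0.5 l batch this gives 25 g sucrose, 100 μl Silwet L77, and 22 nmol benzylamino purine.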

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were sown at a density of ~10 plants per 4” pot onto Pro-Mix BX potting medium (Hummert International) covered with fiberglass mesh (18 mm×16 mm). Plants were grown under continuous illumination (50-75 μE/m²/sec) at 22-23° C. with 65-70% relative humidity. After about 4 weeks, primary inflorescence stems (bolts) were cut off to encourage growth of multiple secondary bolts. After flowering of the mature secondary bolts, plants were prepared for transformation by removal of all siliques and opened flowers.

The pots were then immersed upside down in the mixture of Agrobacterium infiltration medium as described above for 30 sec, and placed on their sides to allow draining onto a 1′×2′ flat surface covered with plastic wrap. After 24 h, the plastic wrap was removed and pots were turned upright. The immersion procedure was repeated one week later, for a total of two immersions per pot. Seeds were then collected from each transformation pot and analyzed following the protocol described below.

Example V Identification of Arabidopsis Primary Transformants

Seeds collected from the transformation pots were sterilized essentially as follows. Seeds were dispersed into a solution containing 0.1% (v/v) Triton X-100 (Sigma) and sterile water and washed by shaking the suspension for 20 min. The wash solution was then drained and replaced with fresh wash solution to wash the seeds for 20 min with shaking. After removal of the ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% (v/v) bleach (CLOROX; Clorox Corp., Oakland, Calif.) was added to the seeds, and the suspension was shaken for 10 min. After removal of the bleach/detergent solution, seeds were then washed five times in sterile distilled water. The seeds were stored in the last wash water at 4° C. for 2 days in the dark before being plated onto antibiotic selection medium (1× Murashige and Skoog salts (pH adjusted to 5.7 with 1M KOH), 1×Gamborg's B-5 vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l kanamycin). Seeds were germinated under continuous illumination (50-75 μE/m²/sec) at 22-23° C. After 7-10 days of growth under these conditions, kanamycin resistant primary transformants (T₁ generation) were visible and obtained. These seedlings were transferred first to fresh selection plates where the seedlings continued to grow for 3-5 more days, and then to soil (Pro-Mix BX potting medium).

Primary transformants were crossed and progeny seeds (T₂) collected; kanamycin resistant seedlings were selected and analyzed. The expression levels of the recombinant polynucleotides in the transformants vary from about a 5% expression level increase to at least a 100% expression level increase. Similar observations are made with respect to polypeptide level expression.

Example VI Identification of Arabidopsis Plants with Transcription Factor Gene Knockouts

The screening of insertion mutagenized Arabidopsis collections for null mutants in a known target gene was essentially as described in Krysan et al. (1999) Plant Cell 11: 2283-2290. Briefly, gene-specific primers, nested by 5-250 base pairs to each other, were designed from the 5′ and 3′ regions of a known target gene. Similarly, nested sets of primers were also created specific to each of the T-DNA or transposon ends (the “right” and “left” borders). All possible combinations of gene specific and T-DNA/transposon primers were used to detect by PCR an insertion event within or close to the target gene. The amplified DNA fragments were then sequenced, which allows the precise determination of the T-DNA/transposon insertion point relative to the target gene. Insertion events within the coding or intervening sequence of the genes were deconvoluted from a pool comprising a plurality of insertion events to a single unique mutant plant for functional characterization. The method is described in more detail in Yu and Adam, U.S. application Ser. No. 09/177,733 filed Oct. 23, 1998.
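The phrase "all possible combinations" of gene-specific and border primers amounts to a Cartesian product of the two primer sets, sketched below. The primer labels are hypothetical placeholders; the text does not name the primers or state how many were in each nested set.

```python
from itertools import product

# Hypothetical labels: an outer and a nested primer for each gene end.
gene_primers = ["gene_5p_outer", "gene_5p_nested",
                "gene_3p_outer", "gene_3p_nested"]

# Hypothetical labels: an outer and a nested primer for each insertion border.
border_primers = ["left_border_outer", "left_border_nested",
                  "right_border_outer", "right_border_nested"]

# Every gene-specific primer is paired with every border primer, so an
# insertion in either orientation near either end of the gene can be amplified.
primer_pairs = list(product(gene_primers, border_primers))
```

With four primers in each set, sixteen PCR reactions cover every pairing for one target gene.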

Example VII Identification of Modified Phenotypes in Overexpression or Gene Knockout Plants

Experiments were performed to identify those transformants or knockouts that exhibited modified biochemical characteristics. Among the biochemicals that were assayed were insoluble sugars, such as arabinose, fucose, galactose, mannose, rhamnose or xylose or the like; prenyl lipids, such as lutein, beta-carotene, xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, delta- or gamma-tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic acid), 18:0 (stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic acid), 20:1 (eicosenoic acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by altering the levels of C29, C31, or C33 alkanes; sterols, such as brassicasterol, campesterol, stigmasterol, sitosterol or stigmastanol or the like; glucosinolates; and protein or oil levels.

Fatty acids were measured using two methods depending on whether the tissue was from leaves or seeds. For leaves, lipids were extracted and esterified with hot methanolic H₂SO₄ and partitioned into hexane from methanolic brine. For seed fatty acids, seeds were pulverized and extracted in methanol:heptane:toluene:2,2-dimethoxypropane:H₂SO₄ (39:34:20:5:2) for 90 minutes at 80° C. After cooling to room temperature the upper phase, containing the seed fatty acid esters, was subjected to GC analysis. Fatty acid esters from both seed and leaf tissues were analyzed with a SUPELCO SP-2330 column (Supelco, Bellefonte, Pa.).

Glucosinolates were purified from seeds or leaves by first heating the tissue at 95° C. for 10 minutes. Preheated ethanol:water (50:50) was added and, after heating at 95° C. for a further 10 minutes, the extraction solvent was applied to a DEAE Sephadex column (Pharmacia) which had been previously equilibrated with 0.5 M pyridine acetate. Desulfoglucosinolates were eluted with 300 μl water and analyzed by reverse phase HPLC monitoring at 226 nm.

For wax alkanes, samples were extracted using the same method as for fatty acids and extracts were analyzed on a HP 5890 GC coupled with a 5973 MSD. Samples were chromatographically isolated on a J&W DB35 mass spectrometer (J&W Scientific Agilent Technologies, Folsom, Calif.).

To measure prenyl lipid levels, seeds or leaves were pulverized with 1 to 2% pyrogallol as an antioxidant. For seeds, extracted samples were filtered and a portion removed for tocopherol and carotenoid/chlorophyll analysis by HPLC. The remaining material was saponified for sterol determination. For leaves, an aliquot was removed and diluted with methanol, and chlorophyll A, chlorophyll B, and total carotenoids were measured by spectrophotometry by determining optical absorbance at 665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for tocopherol and carotenoid/chlorophyll composition by HPLC using a Waters μBondapak C18 column (4.6 mm×150 mm). The remaining methanolic solution was saponified with 10% KOH at 80° C. for one hour. The samples were cooled and diluted with a mixture of methanol and water. A solution of 2% methylene chloride in hexane was mixed in and the samples were centrifuged. The aqueous methanol phase was re-extracted with 2% methylene chloride in hexane and, after centrifugation, the two upper phases were combined and evaporated. 2% methylene chloride in hexane was added to the tubes and the samples were then extracted with one ml of water. The upper phase was removed, dried, and resuspended in 400 μl of 2% methylene chloride in hexane and analyzed by gas chromatography using a 50 m DB-5 ms (0.25 mm ID, 0.25 μm phase, J&W Scientific).

Insoluble sugar levels were measured by the method essentially described by Reiter et al. (1999), Plant J. 12: 335-345. This method analyzes the neutral sugar composition of cell wall polymers found in Arabidopsis leaves. Soluble sugars were separated from sugar polymers by extracting leaves with hot 70% ethanol. The remaining residue containing the insoluble polysaccharides was then acid hydrolyzed with allose added as an internal standard. Sugar monomers generated by the hydrolysis were then reduced to the corresponding alditols by treatment with NaBH₄, then were acetylated to generate the volatile alditol acetates which were then analyzed by GC-FID. Identity of the peaks was determined by comparing the retention times of known sugars converted to the corresponding alditol acetates with the retention times of peaks from wild-type plant extracts. Alditol acetates were analyzed on a Supelco SP-2330 capillary column (30 m×250 μm×0.2 μm) using a temperature program beginning at 180° C. for 2 minutes followed by an increase to 220° C. in 4 minutes. After holding at 220° C. for 10 minutes, the oven temperature was increased to 240° C. in 2 minutes and held at this temperature for 10 minutes and brought back to room temperature.
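The oven temperature program can be written out as a list of segments to check its total length and ramp rates. The sketch below uses only the temperatures and durations stated above and assumes linear ramps; the final cooling to room temperature is omitted because no duration is given for it. The function names are hypothetical.

```python
# Each segment is (start temperature in C, end temperature in C, minutes).
SEGMENTS = [
    (180.0, 180.0, 2.0),   # initial hold at 180 C
    (180.0, 220.0, 4.0),   # ramp to 220 C
    (220.0, 220.0, 10.0),  # hold at 220 C
    (220.0, 240.0, 2.0),   # ramp to 240 C
    (240.0, 240.0, 10.0),  # final hold at 240 C
]


def total_minutes(segments) -> float:
    """Total programmed run time, excluding the final cool-down."""
    return sum(duration for _, _, duration in segments)


def ramp_rates(segments) -> list:
    """Heating rate in degrees C per minute for each ramp segment."""
    return [(end - start) / duration
            for start, end, duration in segments if end != start]


def temperature_at(minute: float, segments) -> float:
    """Oven set point at a given time, assuming linear ramps."""
    if minute < 0:
        raise ValueError("time must not be negative")
    elapsed = 0.0
    for start, end, duration in segments:
        if minute <= elapsed + duration:
            fraction = (minute - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the program")


run_time = total_minutes(SEGMENTS)
rates = ramp_rates(SEGMENTS)
```

Under these assumptions the programmed portion of the run lasts 28 minutes and both ramps correspond to 10° C. per minute.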

To identify plants with alterations in total seed oil or protein content, 150 mg of seeds from T2 progeny plants were subjected to analysis by Near Infrared Reflectance Spectroscopy (NIRS) using a Foss NirSystems Model 6500 with a spinning cup transport system. NIRS is a non-destructive analytical method used to determine seed oil and protein composition. Infrared is the region of the electromagnetic spectrum located after the visible region in the direction of longer wavelengths. ‘Near infrared’ owes its name to being the infrared region nearest to the visible region of the electromagnetic spectrum. For practical purposes, near infrared comprises wavelengths between 800 and 2500 nm. NIRS is applied to organic compounds rich in O-H bonds (such as moisture, carbohydrates, and fats), C-H bonds (such as organic compounds and petroleum derivatives), and N-H bonds (such as proteins and amino acids). The NIRS analytical instruments operate by statistically correlating NIRS signals at several wavelengths with the characteristic or property intended to be measured. All biological substances contain thousands of C-H, O-H, and N-H bonds. Therefore, the exposure to near infrared radiation of a biological sample, such as a seed, results in a complex spectrum which contains qualitative and quantitative information about the physical and chemical composition of that sample.

The numerical value of a specific analyte in the sample, such as protein content or oil content, is obtained through a calibration approach known as chemometrics. Chemometrics applies statistical methods such as multiple linear regression (MLR), partial least squares (PLS), and principal component analysis (PCA) to the spectral data and correlates them with a physical property or other factor; it is that property or factor that is directly determined, rather than the analyte concentration itself. The method first requires “wet chemistry” data from the samples in order to develop the calibration.
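As an illustration of the MLR form of such a calibration, the sketch below fits reference ("wet chemistry") values to absorbances at a few wavelengths by ordinary least squares. It is not the calibration used with the instrument described here; the function names, the number of wavelengths, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def fit_mlr(spectra, reference):
    """Fit reference values to spectral readings by ordinary least squares.

    spectra:   (n_samples, n_wavelengths) absorbance readings
    reference: (n_samples,) wet-chemistry values
    Returns [intercept, slope_1, ..., slope_k].
    """
    spectra = np.asarray(spectra, dtype=float)
    design = np.column_stack([np.ones(len(spectra)), spectra])
    coef, *_ = np.linalg.lstsq(design, np.asarray(reference, dtype=float), rcond=None)
    return coef


def predict_mlr(coef, spectra):
    """Predict the analyte value for one or more spectra."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    return coef[0] + spectra @ coef[1:]


# Synthetic, noise-free calibration set (illustrative values only).
rng = np.random.default_rng(0)
spectra = rng.uniform(0.1, 1.0, size=(20, 3))
true_coef = np.array([5.0, 12.0, -3.0, 8.0])
reference = true_coef[0] + spectra @ true_coef[1:]

coef = fit_mlr(spectra, reference)
predicted = predict_mlr(coef, spectra)
```

Real NIRS calibrations typically involve many correlated wavelengths, which is one reason PLS or PCA-based regression is often preferred over plain MLR.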

Calibration of NIRS response was performed using data obtained by wet chemical analysis of a population of Arabidopsis ecotypes that were expected to represent diversity of oil and protein levels.

The exact oil composition of each ecotype used in the calibration experiment was determined by gravimetric analysis of oils extracted from seed samples (0.5 g or 1.0 g) by the accelerated solvent extraction method (ASE; Dionex Corp, Sunnyvale, Calif.). The extraction method was validated against certified canola samples (Community Bureau of Reference, Belgium). Seed samples from each ecotype (0.5 g or 1 g) were subjected to accelerated solvent extraction and the resulting extracted oil weights compared to the weight of oil recovered from canola seed that has been certified for oil content (Community Bureau of Reference). The oil calibration equation was based on 57 samples with a range of oil contents from 27.0% to 50.8%. To check the validity of the calibration curve, an additional set of samples was extracted by ASE and predicted using the oil calibration equation. This validation set comprised 46 samples, ranging from 27.9% to 47.5% oil, and had a predicted standard error of performance of 0.63%. The wet chemical method for protein was elemental analysis (% N × 6.0) using the average of 3 representative samples of 5 mg each, validated against certified ground corn (NIST). The instrumentation was an Elementar Vario-EL III elemental analyzer operated in CNS operating mode (Elementar Analysensysteme GmbH, Hanau, Germany).
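The two wet-chemistry reference values reduce to simple arithmetic, sketched below. The function names and the example weights and nitrogen readings are hypothetical; only the conversion factor of 6.0 and the use of three replicates come from the text.

```python
def oil_percent(oil_weight_g, seed_weight_g):
    """Gravimetric oil content as a percentage of seed weight."""
    if seed_weight_g <= 0:
        raise ValueError("seed weight must be positive")
    return 100.0 * oil_weight_g / seed_weight_g


def protein_percent(nitrogen_percents, factor=6.0):
    """Protein content as mean % N across replicates times the factor."""
    mean_n = sum(nitrogen_percents) / len(nitrogen_percents)
    return mean_n * factor


# Hypothetical readings: 0.2 g oil from 0.5 g seed; three % N replicates.
print(oil_percent(0.2, 0.5))
print(protein_percent([4.0, 4.2, 4.4]))
```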

The protein calibration equation was based on a library of 63 samples with a range of protein contents from 17.4% to 31.2%. An additional set of samples was analyzed for protein by elemental analysis (n=57) and scanned by NIRS in order to validate the protein prediction equation. The protein range of the validation set was from 16.8% to 31.2% and the standard error of prediction was 0.468%.
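The text reports a standard error of prediction for each validation set without giving the formula. One common definition is the bias-corrected standard deviation of the differences between predicted and reference values; the sketch below assumes that definition, and the function name and sample values are hypothetical.

```python
import math


def standard_error_of_prediction(predicted, reference):
    """Bias-corrected SEP: sqrt(sum((d_i - mean(d))^2) / (n - 1)).

    ASSUMPTION: this is one common definition; the specification does not
    state which formula was used.
    """
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - bias) ** 2 for d in diffs) / (len(diffs) - 1))


# Illustrative values only (% oil): NIRS predictions vs. wet-chemistry data.
print(standard_error_of_prediction([30.0, 40.0, 50.0], [29.0, 41.0, 50.0]))
```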

NIRS analysis of Arabidopsis seed was carried out on experimental samples of between 40 and 300 mg. The oil and protein contents were predicted using the respective calibration equations.

Data obtained from NIRS analysis was analyzed statistically using a nearest-neighbor (N-N) analysis. The N-N analysis allows removal of within-block spatial variability in a fairly flexible fashion, which does not require prior knowledge of the pattern of variability in the chamber. Ideally, all hybrids are grown under identical experimental conditions within a block (rep). In reality, even in many block designs, significant within-block variability exists. Nearest-neighbor procedures are based on the assumption that the environmental effect of a plot is closely related to that of its neighbors. Nearest-neighbor methods use information from adjacent plots to adjust for within-block heterogeneity and so provide more precise estimates of treatment means and differences. If there is within-block heterogeneity on a spatial scale that is larger than a single plot and smaller than the entire block, then yields from adjacent plots will be positively correlated. Information from neighboring plots can be used to reduce or remove the unwanted effect of the spatial heterogeneity, and hence improve the estimate of the treatment effect. Data from neighboring plots can also be used to reduce the influence of competition between adjacent plots. The Papadakis N-N analysis can be used with designs to remove within-block variability that would not be removed with the standard split plot analysis (Papadakis (1973) Inst. d'Amelior. Plantes Thessaloniki (Greece) Bull. Scientif. No. 23; Papadakis (1984) Proc. Acad. Athens 59: 326-342).
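The idea behind a Papadakis-type adjustment can be sketched in a few lines: compute each plot's residual from its treatment mean, average the residuals of the neighboring plots to form a covariate, and use that covariate in an analysis of covariance. The sketch below is a simplified illustration, assuming plots arranged in a single row with only left and right neighbors; the function names are hypothetical and this is not presented as the exact analysis used here.

```python
import numpy as np


def neighbor_covariate(residuals):
    """Mean residual of the immediate left/right neighbors of each plot."""
    r = np.asarray(residuals, dtype=float)
    x = np.empty_like(r)
    for i in range(len(r)):
        neighbors = [r[j] for j in (i - 1, i + 1) if 0 <= j < len(r)]
        x[i] = np.mean(neighbors)
    return x


def papadakis_adjusted_means(yields, treatments):
    """Return ({treatment: adjusted mean}, covariate slope) for one row of plots."""
    y = np.asarray(yields, dtype=float)
    t = np.asarray(treatments)
    labels = np.unique(t)
    dummies = (t[:, None] == labels[None, :]).astype(float)
    counts = dummies.sum(axis=0)

    raw_means = dummies.T @ y / counts
    residuals = y - dummies @ raw_means
    x = neighbor_covariate(residuals)

    # Analysis of covariance: treatment effects plus the neighbor covariate.
    design = np.column_stack([dummies, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    slope = coef[-1]

    # Standard covariance adjustment of each treatment mean.
    x_means = dummies.T @ x / counts
    adjusted = raw_means - slope * (x_means - x.mean())
    return dict(zip(labels.tolist(), adjusted.tolist())), slope
```

For a two-dimensional layout the covariate would usually draw on neighbors in both directions, and the adjustment can be iterated; both refinements are omitted here for brevity.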

Experiments were performed to identify those transformants or knockouts that exhibited modified sugar-sensing. For such studies, seeds from transformants were germinated on media containing 5% glucose or 9.4% sucrose, which normally partially restrict hypocotyl elongation. Plants with altered sugar sensing may have either longer or shorter hypocotyls than normal plants when grown on these media. Additionally, other plant traits may be varied, such as root mass.

Experiments may be performed to identify those transformants or knockouts that exhibited an improved pathogen tolerance. For such studies, the transformants are exposed to biotrophic fungal pathogens, such as Erysiphe orontii, and necrotrophic fungal pathogens, such as Fusarium oxysporum. Fusarium oxysporum isolates cause vascular wilts and damping off of various annual vegetables, perennials and weeds (Mauch-Mani and Slusarenko (1994) Molec Plant-Microbe Interact. 7: 378-383). For Fusarium oxysporum experiments, plants are grown on Petri dishes and sprayed with a fresh spore suspension of F. oxysporum. The spore suspension is prepared as follows: A plug of fungal hyphae from a plate culture is placed on a fresh potato dextrose agar plate and allowed to spread for one week. Five ml sterile water is then added to the plate, swirled, and pipetted into 50 ml Armstrong Fusarium medium. Spores are grown overnight in Fusarium medium and then sprayed onto plants using a Preval paint sprayer. Plant tissue is harvested and frozen in liquid nitrogen 48 hours post-infection.

Erysiphe orontii is a causal agent of powdery mildew. For Erysiphe orontii experiments, plants are grown approximately 4 weeks in a greenhouse under 12 hour light (20° C., ~30% relative humidity (rh)). Individual leaves are infected with E. orontii spores from infected plants using a camel's hair brush, and the plants are transferred to a Percival growth chamber (20° C., 80% rh). Plant tissue is harvested and frozen in liquid nitrogen 7 days post-infection.

Botrytis cinerea is a necrotrophic pathogen. Botrytis cinerea is grown on potato dextrose agar under 12 hour light (20° C., ~30% relative humidity (rh)). A spore culture is made by spreading 10 ml of sterile water on the fungus plate, swirling and transferring spores to 10 ml of sterile water. The spore inoculum (approx. 10⁵ spores/ml) is then used to spray 10 day-old seedlings grown under sterile conditions on MS (minus sucrose) media. Symptoms are evaluated every day up to approximately 1 week.

Sclerotinia sclerotiorum hyphal cultures are grown in potato dextrose broth. One gram of hyphae is ground, filtered, spun down and resuspended in sterile water. A 1:10 dilution is used to spray 10 day-old seedlings grown aseptically under a 12 hour light/dark regime on MS (minus sucrose) media. Symptoms are evaluated every day up to approximately 1 week.

Pseudomonas syringae pv maculicola (Psm) strain 4326 was inoculated by hand at two doses. Using two inoculation doses allows differentiation between plants with enhanced susceptibility and plants with enhanced resistance to the pathogen. Plants are grown for 3 weeks in the greenhouse, then transferred to the growth chamber for the remainder of their growth. Psm ES4326 may be hand inoculated with a 1 ml syringe on 3 fully-expanded leaves per plant (4½ wk old), using at least 9 plants per overexpressing line at two inoculation doses, OD=0.005 and OD=0.0005. Disease scoring is performed at day 3 post-inoculation, with pictures of the plants and leaves taken in parallel.

In some instances, expression patterns of the pathogen-induced genes (such as defense genes) may be monitored by microarray experiments. In these experiments, cDNAs are generated by PCR and resuspended at a final concentration of ~100 ng/μl in 3×SSC or 150 mM Na-phosphate (Eisen and Brown (1999) Methods Enzymol. 303: 179-205). The cDNAs are spotted on microscope glass slides coated with polylysine. The prepared cDNAs are aliquoted into 384 well plates and spotted on the slides using, for example, an x-y-z gantry (OmniGrid), which may be purchased from GeneMachines (Menlo Park, Calif.), outfitted with quill type pins, which may be purchased from Telechem International (Sunnyvale, Calif.). After spotting, the arrays are cured for a minimum of one week at room temperature, rehydrated and blocked following the protocol recommended by Eisen and Brown (1999; supra).

Total RNA samples (10 μg) are labeled using fluorescent Cy3 and Cy5 dyes. Labeled samples are resuspended in 4×SSC/0.03% SDS/4 μg salmon sperm DNA/2 μg tRNA/50 mM Na-pyrophosphate, heated at 95° C. for 2.5 minutes, spun down and placed on the array. The array is then covered with a glass coverslip and placed in a sealed chamber. The chamber is then kept in a water bath at 62° C. overnight. The arrays are washed as described in Eisen and Brown (1999, supra) and scanned on a General Scanning 3000 laser scanner. The resulting files are subsequently quantified using IMAGENE software (BioDiscovery, Los Angeles, Calif.).

RT-PCR experiments may be performed to identify those genes induced after exposure to biotrophic fungal pathogens, such as Erysiphe orontii, necrotrophic fungal pathogens, such as Fusarium oxysporum, bacteria, viruses and salicylic acid, the latter being involved in a nonspecific resistance response in Arabidopsis thaliana. Generally, the gene expression patterns from ground plant leaf tissue are examined.

Reverse transcriptase PCR was conducted using gene specific primers within the coding region for each sequence identified. The primers were designed near the 3′ region of each DNA binding sequence initially identified.

Total RNA from these ground leaf tissues was isolated using the CTAB extraction protocol. Once extracted, total RNA was normalized in concentration across all the tissue types, using the 28S band as reference, to ensure that the PCR reaction for each tissue received the same amount of cDNA template. Poly(A+) RNA was purified using a modified protocol from the Qiagen OLIGOTEX purification kit batch protocol. cDNA was synthesized using standard protocols. After the first strand cDNA synthesis, primers for Actin 2 were used to normalize the concentration of cDNA across the tissue types. Actin 2 is found to be constitutively expressed at fairly equal levels across the tissue types being investigated.

For RT PCR, cDNA template was mixed with corresponding primers and Taq DNA polymerase. Each reaction consisted of 0.2 μl cDNA template, 2 μl 10×Tricine buffer, 2 μl 10×Tricine buffer and 16.8 μl water, 0.05 μl Primer 1, 0.05 μl Primer 2, 0.3 μl Taq DNA polymerase and 8.6 μl water.

The 96 well plate is covered with microfilm and set in the thermocycler to start the reaction cycle. By way of illustration, the reaction cycle may comprise the following steps:

-   Step 1: 93° C. for 3 min;
-   Step 2: 93° C. for 30 sec;
-   Step 3: 65° C. for 1 min;
-   Step 4: 72° C. for 2 min;
-   Steps 2, 3 and 4 are repeated for 28 cycles;
-   Step 5: 72° C. for 5 min; and
-   Step 6: 4° C.
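The illustrative program above can be captured as data to work out how long a run takes. This is a minimal sketch; the constant and function names are hypothetical, and the totals count programmed hold times only (temperature ramping and the final 4° C. hold are not included).

```python
# Hold times in minutes, taken from the illustrative program in the text.
INITIAL_DENATURATION = 3.0      # Step 1: 93 C
CYCLE_STEPS = [                 # Steps 2-4, repeated each cycle
    ("denature", 93, 0.5),
    ("anneal", 65, 1.0),
    ("extend", 72, 2.0),
]
FINAL_EXTENSION = 5.0           # Step 5: 72 C


def cycling_minutes(cycles):
    """Total hold time for the given number of repeats of steps 2-4."""
    return cycles * sum(minutes for _, _, minutes in CYCLE_STEPS)


def first_run_minutes(cycles=28):
    """Hold time for the full program: step 1, the cycles, then step 5."""
    return INITIAL_DENATURATION + cycling_minutes(cycles) + FINAL_EXTENSION


print(first_run_minutes())   # 106.0 minutes for the 28-cycle program
print(cycling_minutes(8))    # 28.0 minutes for 8 further cycles of steps 2-4
```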

To amplify more product, for example to identify genes that have very low expression, additional steps may be performed. The following illustrates a method that may be used in this regard. The PCR plate is placed back in the thermocycler for 8 more cycles of steps 2-4.

-   Step 2: 93° C. for 30 sec;
-   Step 3: 65° C. for 1 min;
-   Step 4: 72° C. for 2 min, repeated for 8 cycles; and
-   Step 5: 4° C.

Eight microliters of PCR product and 1.5 μl of loading dye are loaded on a 1.2% agarose gel for analysis after 28 cycles and 36 cycles. Expression levels of specific transcripts are considered low if they are only detectable after 36 cycles of PCR. Expression levels are considered medium or high depending on the levels of transcript compared with observed transcript levels for an internal control such as actin2. Transcript levels are determined in repeat experiments and compared to transcript levels in control (e.g., non-transformed) plants.
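The scoring rule described above can be expressed as a small decision function. This is a sketch only: the function name and the "not detected" label are assumptions, and the split between medium and high is left unresolved because the text ties it to a comparison with the actin2 control without giving thresholds.

```python
def classify_expression(detected_at_28, detected_at_36):
    """Classify a transcript from band detection after 28 and 36 cycles.

    "low" follows the rule in the text (detectable only after 36 cycles).
    "not detected" is an assumed label for the case the text does not name.
    """
    if detected_at_28:
        # Medium vs. high requires comparison with the actin2 control band.
        return "medium or high"
    if detected_at_36:
        return "low"
    return "not detected"


print(classify_expression(False, True))   # low
print(classify_expression(True, True))    # medium or high
```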

Experiments were performed to identify those transformants or knockouts that exhibited an improved environmental stress tolerance. For such studies, the transformants were exposed to a variety of environmental stresses. Plants were exposed to chilling stress (6 hour exposure to 4-8° C.), heat stress (6 hour exposure to 32-37° C.), high salt stress (6 hour exposure to 200 mM NaCl), drought stress (168 hours after removing water from trays), osmotic stress (6 hour exposure to 3 M mannitol), or nutrient limitation (nitrogen: all components of MS medium remained constant except N was reduced to 20 mg/l of NH₄NO₃; phosphate: all components of MS medium except KH₂PO₄, which was replaced by K₂SO₄; potassium: all components of MS medium except removal of KNO₃ and KH₂PO₄, which were replaced by NaH₄PO₄).

Experiments were performed to identify those transformants or knockouts that exhibited modified structure and development characteristics. For such studies, the transformants were observed by eye to identify novel structural or developmental characteristics associated with the ectopic expression of the polynucleotides or polypeptides of the invention.

Flowering time was measured by the number of rosette leaves present when a visible inflorescence of approximately 3 cm was apparent. Rosette and total leaf number on the progeny stem are tightly correlated with the timing of flowering (Koornneef et al. (1991) Mol. Gen. Genet. 229: 57-66). The vernalization response was also measured. For vernalization treatments, seeds were sown to MS agar plates, sealed with micropore tape, and placed in a 4° C. cold room with low light levels for 6-8 weeks. The plates were then transferred to the growth rooms alongside plates containing freshly sown non-vernalized controls. Rosette leaves were counted when a visible inflorescence of approximately 3 cm was apparent.

Modified phenotypes observed for particular overexpressor or knockout plants are provided in Table 4. For a particular overexpressor that shows a less beneficial characteristic, it may be more useful to select a plant with a decreased expression of the particular transcription factor. For a particular knockout that shows a less beneficial characteristic, it may be more useful to select a plant with an increased expression of the particular transcription factor.

The sequences of the Sequence Listing or those in Tables 4-8, or those disclosed here, can be used to prepare transgenic plants and plants with altered traits. The specific transgenic plants listed below are produced from the sequences of the Sequence Listing, as noted. Table 4 provides exemplary polynucleotide and polypeptide sequences of the invention.

Example VIII Examples of Genes that Confer Significant Improvements to Plants

A number of genes and homologs that confer significant improvements to knockout or overexpressing plants are noted below. Experimental observations made with regard to specific genes whose expression was modified in overexpressing or knockout plants, and potential applications based on these observations, are also presented.

G8 (SEQ ID NO: 11)

Published Information

G8 corresponds to gene At2g28550 (AAD21489). The gene has also been described as RAP2.7 (Okamuro et al. (1997) Proc. Natl. Acad. Sci. U.S.A. 94: 7076-7081). No functional information is available about G8.

Experimental Observations

The function of G8 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G8 caused alterations in plant development, the most consistent one being a delay in flowering time.

This phenotype was observed in approximately 25% of the primary transformants. These individuals showed a relatively strong phenotype and typically made 30-50 leaves (versus 10-12 for the wild-type controls) prior to bolting under 24-hour light. This phenotype was reproduced in some, but not all, of the T2 progeny plants from each one of the lines. Additionally, a further T2 population was found to flower later than wild type in 12-hour light conditions. Thus, late flowering was observed in both the T1 and T2 generations, and in different photoperiodic conditions.

It should also be noted that many 35S::G8 plants appeared smaller than controls, particularly at early stages. Accordingly, in the T2 lines used for physiological analyses it was observed that seedlings were smaller and showed reduced vigor when germinated on MS plates. However, not all 35S::G8 lines showed these effects.

G8 was ubiquitously expressed, at higher levels in rosette leaves, and did not appear to be induced by any of the conditions tested.

Utilities

G8 could potentially be used to alter flowering time.

G19 (SEQ ID NO: 21)

Published Information

G19 belongs to the EREBP subfamily of transcription factors, i.e., it contains only one AP2 domain. G19 corresponds to the previously described gene RAP2.3 (Okamuro et al. (1997) Proc. Natl. Acad. Sci. U.S.A. 94: 7076-7081). Close inspection of the Arabidopsis cDNA sequences of RAP2.3 (AF003096; Okamuro et al. (1997) supra), AtEBP (Y09942; Buttner and Singh (1997) Proc. Natl. Acad. Sci. U.S.A. 94: 5961-5966), and ATCADINP (Z37504) suggests that they may correspond to the same gene (Riechmann and Meyerowitz, (1998) Biol. Chem. 379: 633-646). G19/RAP2.3 is ubiquitously expressed (Okamuro et al. (1997) supra). AtEBP was isolated by virtue of the protein-protein interaction between AtEBP and OBF4, a basic-region leucine zipper transcription factor (Buttner and Singh (1997) supra). AtEBP expression levels in seedlings were increased after treatment with ethylene (ethephon) (Buttner and Singh (1997) supra). AtEBP was found to bind to GCC-box containing sequences, like that of the PRB-1b promoter (Buttner and Singh (1997) supra). It has been suggested that the interaction between AtEBP and OBF4 reflects cross-coupling between EREBP and bZIP transcription factors which might be important in regulating gene expression during the plant defense response (Buttner and Singh (1997) supra).

Experimental Observations

Transgenic plants in which G19 was expressed under the control of the 35S promoter were morphologically similar to control plants. G19 was constitutively expressed in the different tissues examined; however, G19 expression was significantly repressed by methyl jasmonate (MeJ) and induced by ACC (this latter result correlates with the previously described increase in G19 expression levels in seedlings after treatment with ethylene (ethephon); Buttner and Singh (1997) supra). G19 was significantly induced upon infection by the fungal pathogen Erysiphe orontii. In addition, G19 overexpressing plants were more tolerant to infection with a moderate dose of Erysiphe orontii.

Both the jasmonic acid and the ethylene signal transduction pathways are involved in the regulation of the defense response and the wound response, and the two pathways have been found to interact synergistically. The regulation of G19 expression by both hormones, its induction upon Erysiphe orontii infection, as well as the preliminary data indicating that increased tolerance to that pathogen is conferred by G19 overexpression, suggested that G19 might play a role in the control of the defense and/or wound response.

Utilities

G19 can be used to manipulate the plant defense, wound, or insect response, as well as the jasmonic acid and ethylene signal transduction pathways themselves.

G22 (SEQ ID NO: 27)

Published Information

G22 has been identified in the sequence of BAC T13E15 (gene T13E15.5) by The Institute for Genomic Research (TIGR) as a “TINY transcription factor isolog”. G22 belongs to the EREBP subfamily, i.e., it contains only one AP2 domain, and phylogenetic analyses place G22 relatively close to other EREBP subfamily genes, like TINY and ATDL4400C (Riechmann and Meyerowitz (1998) Biol. Chem. 379: 633-646). No functional information is available about G22.

Experimental Observations

G22 was constitutively expressed at medium levels. There appeared to be no phenotypic alteration on plant morphology upon G22 overexpression. Plants ectopically overexpressing G22 were more tolerant to high NaCl-containing media in a root growth assay compared to wild-type controls.

Utilities

G22 could be used to increase plant tolerance to soil salinity during germination, at the seedling stage, or throughout the plant life cycle.

G24 (SEQ ID NO: 29)

Published Information

G24 corresponds to gene At2g23340 (AAB87098). No information is available about the function(s) of G24.

Closely Related Genes from Other Species

G24 is highly related to a Descurainia sophia AP2/EREBP gene represented by cDNA clone BG321374 (BG321374 Ds01_06d08_R Ds01_AAFC_ECORC_cold_stressed_Flixweed_seedlings Descurainia sophia cDNA clone Ds01_06d08, mRNA sequence).

Experimental Observations

The function of G24 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G24 caused alterations in plant growth and development. Most notably, 35S::G24 seedlings often developed black necrotic tissue patches on cotyledons and leaves, and many died at that stage. Some 35S::G24 seedlings exhibited a weaker phenotype, and although necrotic patches were visible on the cotyledons, they did not die. These seedlings developed into plants that were usually small, slow growing, and poorly fertile in comparison to wild type controls. The leaves of older 35S::G24 plants were also observed to become yellow and senesce prematurely compared to wild type. For those lines that could be assayed in biochemical and physiological assays, no differences were observed with respect to wild type controls.

G24 is ubiquitously expressed, at apparently lower levels in germinating seedlings, and is not significantly induced by any of the conditions tested.

The AP2 domain of G24 is nearly identical to that of other Arabidopsis EREBP proteins, such as G12, G1379, and G1277. Whether all these proteins share related functions remains to be determined.

Utilities

G24 or its equivalogs can be used to trigger cell death and influence or control processes in which cell death plays a role. G24 can be used to block pathogen infection by triggering cell death in infected cells and thereby blocking spread of the disease.

G28 (SEQ ID NO: 37)

Published Information

G28 corresponds to AtERF1 (GenBank accession number AB008103) (Fujimoto et al. (2000) Plant Cell 12: 393-404). G28 appears as gene AT4g17500 in the annotated sequence of Arabidopsis chromosome 4 (AL161546.2).

AtERF1 has been shown to have GCC-box binding activity [some defense-related genes that were induced by ethylene were found to contain a short cis-acting element known as the GCC-box: AGCCGCC (Ohme et al. (1990) Plant Mol. Biol. 15: 941-946)]. Using transient assays in Arabidopsis leaves, AtERF1 was found to be able to act as a GCC-box sequence specific transactivator (Fujimoto et al. (2000) supra).

AtERF1 expression has been described to be induced by ethylene (two- to three-fold increase in AtERF1 transcript levels 12 h after ethylene treatment) (Fujimoto et al. (2000) supra). In the ein2 mutant, the expression of AtERF1 was not induced by ethylene, suggesting that the ethylene induction of AtERF1 is regulated under the ethylene signaling pathway (Fujimoto et al. (2000) supra). AtERF1 expression was also induced by wounding, but not by other abiotic stresses (such as cold, salinity, or drought) (Fujimoto et al. (2000) supra).

It has been suggested that AtERFs, in general, may act as transcription factors for stress-responsive genes, and that the GCC-box may act as a cis-regulatory element for biotic and abiotic stress signal transduction in addition to its role as an ethylene responsive element (ERE) (Fujimoto et al. (2000) supra), but there are no data available on the physiological functions of AtERF1.

Experimental Observations

The function of G28 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G28 overexpressing lines were more tolerant to infection with a moderate dose of the fungal pathogen Erysiphe orontii. G28 overexpression did not seem to have detrimental effects on plant growth or vigor, since plants from most of the lines were morphologically wild-type. In addition, no difference was detected between those lines and the corresponding wild-type controls in all the biochemical assays that were performed. G28 was ubiquitously expressed.

G28 overexpressing lines were also more tolerant to Sclerotinia sclerotiorum and Botrytis cinerea. In a repeat experiment using individual lines, all three lines analyzed showed tolerance to S. sclerotiorum, and two of the three lines tested were more tolerant to B. cinerea.

Utilities

G28 transgenic plants had an altered response to fungal pathogens, in that those plants were more tolerant to the pathogens. Therefore, G28 or its equivalogs can be used to manipulate the defense response in order to generate pathogen-resistant plants.

G46 (SEQ ID NO: 53)

Published Information

G46 was first identified in the sequence of P1 clone MBK20 (GenBank accession number AB010070, gene MBK20.1). No information is available about the function(s) of G46.

Experimental Observations

RT-PCR experiments revealed that G46 is ubiquitously expressed, but is apparently induced by stress conditions such as auxin, heat, salt and Erysiphe.

The function of G46 was first studied by analyzing knockout mutants with a line homozygous for a T-DNA insertion in the gene. G46 knockout mutant plants were indistinguishable from wild-type in all assays performed.

The function of G46 was also analyzed using transgenic plants in which a cDNA clone of the gene was expressed under the control of the 35S promoter. A small number of lines were larger than wild-type plants, developed more rapidly, and yielded an increased quantity of seed compared to wild-type controls.

In the physiological analysis, all three 35S::G46 lines tested showed more resistance to severe water deprivation stress. Seedlings were generally larger and greener than the control plants exposed to the same conditions.

Utilities

The increased size and growth rate seen in some of the lines indicates that the gene could be used to increase crop productivity.

The reduced sensitivity of 35S::G46 lines in the dehydration stress assay indicates that the gene may also be used to engineer crops with increased tolerance to drought, salt, freezing and/or chilling stress, or increased water use efficiency.

G153 (SEQ ID NO: 65)

Published Information

G153 corresponds to the Arabidopsis ANR1 gene. This locus was identified by Zhang and Forde ((1998) Science 279: 407-409) as a MADS box gene that is rapidly induced in the roots of nitrogen starved seedlings, following exposure to a nitrate source. Additionally, it was shown that transgenic lines in which an antisense clone of ANR1 is overexpressed show an altered sensitivity to nitrate and, unlike wild-type plants, do not exhibit lateral root proliferation in response to nitrate treatments. From these data, it was concluded that ANR1 is a key regulator of nutrient-induced changes in root architecture (Zhang and Forde (1998) supra).

However, Wang et al. ((2000) Plant Cell 12: 1491-1509) have published data that contradict the results of Zhang and Forde. These authors found that ANR1 is actually repressed, rather than induced, following treatment of nitrogen starved seedlings (grown on 10 mM ammonium succinate as the sole nitrogen source) with 5 mM nitrate.

A phylogenetic analysis of the Arabidopsis MADS box gene family situated ANR1 in the same clade as three other MADS box genes: AGL16 (G860), AGL17 (G152) and AGL21 (G1760) (Alvarez-Buylla et al. (2000) Proc Natl Acad Sci USA. 97: 5328-5333). Two of the genes, AGL17 and AGL21, were recently shown to be expressed in specific zones of the root, suggesting that different members of the ANR1 clade may play distinct regulatory roles during root development (Burgeff et al. (2002) Planta 214: 365-372).

The ANR1 sequence (GenBank accession AX507709) has also been included in a patent publication (WO0216655) as a stress-regulated plant.

Experimental Observations

RT-PCR experiments revealed that G153 was up-regulated in leaves in response to heat and Fusarium treatments. Lower levels of induction were also observed following auxin, ABA, and cold treatments, indicating that G153 might have a role in a variety of stress responses.

To further assess the function of the gene, 35S::G153 lines were generated and subjected to various assays. Around a third of the lines showed a marked acceleration in the onset of flowering, suggesting that the gene might impinge on genetic pathways that regulate flowering time. In addition to the effects on flowering, 35S::G153 lines displayed an enhanced performance in an assay intended to reveal alterations in C:N sensing. 35S::G153 seedlings contained less anthocyanin (and in some cases were larger) than wild-type controls grown on high sucrose/N- plates. Seedlings were also larger and greener on high sucrose/N- plates that had been supplemented with glutamine. Together, these data indicated that overexpression of G153 alters the ability to modulate carbon and/or nitrogen uptake and utilization.

A closely related gene, G1760 (SEQ ID NO: 937), was also analyzed. Like 35S::G153 transformants, 35S::G1760 lines exhibited early flowering, and RT-PCR studies showed G1760 to be predominantly expressed in roots and to be stress responsive. Thus, G1760 and G153 likely have similar and/or overlapping functions.

Utilities

The response of G153 expression to different physiological treatments indicated that the gene or its equivalogs could be used to improve resistance to a variety of different stresses. In particular, the enhanced performance of 35S::G153 lines under low nitrogen conditions indicated that G153 might be used to engineer crops that could thrive in environments with reduced nitrogen availability.

Given the early flowering seen in the 35S::G153 transformants, the gene or its equivalogs might also be applied to manipulate the flowering time of commercial species. In particular, G153 could be used to accelerate flowering, or eliminate any requirement for vernalization. Conversely, it might be possible to modify the activity of G153 or its equivalogs to delay flowering in order to achieve an increase in biomass and yield.

G156 (SEQ ID NO: 67)

Published Information

G156 corresponds to gene MKD15.12 (GenBank accession number BAB11181.1). G156 has also been described as AGL32 (Alvarez-Buylla et al. (2000) Proc. Natl. Acad. Sci. 97:5328-5333). Phylogenetic analyses of the Arabidopsis MADS box gene family indicate that G156/AGL32 is a Type II MADS-box gene, but it does not belong to any of the well-characterized Type II MADS gene clades (Alvarez-Buylla et al. 2000 supra).

Experimental Observations

The complete cDNA sequence of G156 was determined. The function of this gene was analyzed using both transgenic plants in which G156 was expressed under the control of the 35S promoter and a line homozygous for a T-DNA insertion in the gene. The T-DNA insertion lies in the second intron, and was expected to result in a strong loss-of-function or null mutation.

G156 knockout mutant plants produced yellow seed that showed more variation in shape than wild type, implying a function (direct or indirect) for G156 in seed development. G156 mutant plants were otherwise normal at all other developmental stages. Expression of G156 was determined to be specific to floral tissues. Although expression was detected by RT-PCR in flowers, siliques, and embryos, it could well be that G156 was specifically expressed in embryo/seed during development, in light of the many MADS box genes that have been shown to be expressed in specific floral organs or cell types, and of the G156 knockout mutant phenotype. In situ RNA hybridization experiments will determine the G156 expression pattern more precisely.

The coloration phenotype of the G156 knockout mutant seed resembles that of ttg1 and the transparent testa mutants. TTG1, which is localized on Chromosome 5, but approximately 0.5 Mb away from the clone that contains G156 (MKD15), codes for a WD40 repeat protein (Walker et al. (1999) Plant Cell 11:1337-1350). The transparent testa (tt) loci were identified in screens for mutations that result in yellow or pale brown seeds (Koornneef (1990) Arabidopsis Inf. Ser. 27:1-4). Many of the "TT" genes have been mapped, and several of them have been cloned and shown to be involved in the anthocyanin pathway (Debeaujon et al. (2001) Plant Cell 13:853-872).

None of the TT genes corresponds to G156. TT3, TT4, TT5, and TT7 code for dihydroflavonol 4-reductase, chalcone synthase, chalcone flavanone isomerase, and flavonoid 3′-hydroxylase, respectively (Shirley et al. (1992) Plant Cell 4:333-347; Shirley et al. (1995) Plant J. 8:659-671). TT12 encodes a multidrug secondary transporter-like protein required for flavonoid sequestration in vacuoles of the seed coat endothelium (Debeaujon et al. (2001) supra). TT6 and TT9 map on Chromosome 3, and TT1 maps on Chromosome 1. TT2 and TT10 map on Chromosome 5, but far away from the position of G156 (Shirley et al. (1995) supra). TT8 has also been cloned and shown to encode a transcription factor of the basic helix-loop-helix class (Nesi et al. (2000) Plant Cell 12:1863-1878), providing further evidence for the regulation of the anthocyanin pathway at the transcriptional level.

The similarity of the G156 knockout and tt seed coloration phenotypes, and the involvement of at least some of the TT genes in the anthocyanin pathway, suggested that G156 is involved in the regulation of that pathway.

In addition to the seed coloration phenotype, the G156 knockout mutant showed a significant increase in the percentage of seed 18:1 fatty acids.

G156 overexpressing plants showed a variety of morphological alterations, which were largely uninformative. The most severely affected transformants were extremely dwarfed, had aberrant branching, and sometimes possessed terminal flowers. These phenotypic alterations were frequently observed when MADS box genes that were involved in flower development were overexpressed in Arabidopsis (for instance, AG, AP1, and AP3+PI; Mizukami et al. (1992) Cell 71:119-131; Mandel et al. (1995) Nature 377:522-524; Krizek et al. (1996) Development 122:11-22).

Both G156 knockout mutant plants and G156 overexpressing lines behaved like the wild-type controls in the physiological assays performed.

Utilities

G156 or its equivalogs can be used to manipulate the anthocyanin biosynthetic pathway, such as for altering seed coloration. In addition, the promoter of G156 may be used to confer seed-specific expression to genes of interest.

G157 (SEQ ID NO: 69)

Published Information

G157 was first identified in the sequence of BAC F22K20 (GenBank accession number AC002291; gene F22K20.15).

Experimental Observations

G157 was recognized as a gene highly related to Arabidopsis FLOWERING LOCUS C (FLC; Michaels et al. (1999) Plant Cell 11:949-956; Sheldon et al. (1999) Plant Cell 11:445-458). FLC acts as a repressor of flowering. Late flowering, vernalization-responsive ecotypes and mutants have high steady state levels of FLC transcript, which decrease during the promotion of flowering by vernalization. FLC therefore has a central role in regulating the response to vernalization (Michaels et al. (1999) supra; Sheldon et al. (1999) supra; Sheldon et al. (2000) Proc. Natl. Acad. Sci. 97:3753-3758).

The function of G157 was studied using transgenic plants in which this gene was expressed under the control of the 35S promoter. Over-expression of G157 modifies flowering time, and it appears to do so in a quantitative manner: a modest level of over-expression triggers early flowering, whereas a larger increase delays flowering. G157 over-expression promoted flowering in the Arabidopsis late-flowering vernalization-dependent ecotypes Stockholm and Pitztal.

In contrast to FLC, G157 transcript levels showed no correlation with the vernalization response, and over-expression of G157 did not influence FLC transcript levels. Thus, G157 likely acts downstream or independently of FLC transcription. In addition, a cluster of four additional FLC-like and G157-like genes was identified, raising the possibility that a whole sub-group of proteins within the MADS family regulates flowering time.

G157 overexpressing plants did not show any other morphological, physiological, or biochemical alteration in the assays that were performed. Overexpression of G157 was not observed to have deleterious effects: 35S::G157 plants were healthy and attained a wild-type stature when mature.

For many crops, high yielding winter strains can only be grown in regions where the growing season is sufficiently cold and prolonged to elicit vernalization. A system that could trigger flowering at higher temperatures would greatly expand the acreage over which winter varieties can be cultivated. The finding that G157 overexpression caused early flowering in Arabidopsis Stockholm and Pitztal plants indicated that the gene can overcome the high level of FRIGIDA and FLC activity present in those late ecotypes. That the effects were similar to those caused by vernalization implied that G157 might be applicable to winter strains of crop species. To date, a substantial number of genes have been found to promote flowering. Many, however, including those encoding the transcription factors APETALA1, LEAFY, and CONSTANS, produce extreme dwarfing and/or shoot termination when over-expressed. Overexpression of G157 was not observed to have deleterious effects. 35S::G157 Arabidopsis plants were healthy and attained a wild-type stature when mature. Irrespective of the mode of G157 action, and whether its true biological role is as an activator or a repressor of flowering, the results suggested that G157 may produce either early or late flowering, according to the level of over-expression.

G162 (SEQ ID NO: 71)

Published Information

G162 corresponds to gene At2g34440 (AAC26702), and it has also been referred to as AGL29.

Experimental Observations

The function of G162 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. 35S::G162 plants were wild-type in morphology and development. Overexpression of G162 resulted in a significant increase in oil content in seeds, as measured by NIR.

Utilities

G162 or its equivalogs may be used to increase seed oil content and manipulate seed protein content in crop plants.

G180 (SEQ ID NO: 87)

Published Information

G180 was identified in the sequence of BAC F16B22 (GenBank accession number AC003672).

Experimental Observations

The complete sequence of G180 was determined. G180 was not annotated in the sequence of Arabidopsis thaliana chromosome II section 239 of 255 of the complete sequence (AC003672.2), where it resides between At2g44740 and At2g44750.

The function of G180 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter.

G180 overexpressing plants were early flowering, but did not exhibit other major developmental alterations. A number of Arabidopsis genes have already been shown to accelerate flowering when constitutively expressed. These include LEAFY, APETALA1 and CONSTANS. In these cases, however, the early flowering plants showed undesirable side effects such as extreme dwarfing, infertility, or premature termination of shoot meristem growth (Mandel et al. (1995) Nature 377:522-524; Weigel et al. (1995) Nature 377:495-500; Simon et al. (1996) Nature 384:59-62). It appeared that G180 induced flowering without these toxic pleiotropic effects.

G180 overexpressing lines also showed a decrease in seed oil content. That decrease was accompanied by increased seed protein content in one of the three lines analyzed.

Utilities

G180 overexpression appeared to alter flowering time by accelerating the transition from vegetative to reproductive state. Therefore, G180 or its equivalogs may be used to manipulate flowering time in plants. In addition, G180 or its equivalogs can also have utility in modifying seed traits, particularly in modifying seed oil and protein levels in crop plants.

G188 (SEQ ID NO: 97)

Published Information

G188 corresponds to gene MXC20.3, first identified in the sequence of clone MXC20 (released by the Arabidopsis Genome Initiative; GenBank accession number AB009055). No published information is available about the function(s) of G188.

Experimental Observations

The annotation of G188 in BAC AB009055 was experimentally confirmed. G188 appeared to be expressed in all tissues and under all conditions examined.

A line homozygous for a T-DNA insertion in G188 was initially used to characterize the function of this gene. The T-DNA insertion in G188 was localized in the second intron of the gene, which is located in the middle of the conserved WRKY box. This insertion resulted in a null mutation. G188 mutant plants displayed several phenotypic alterations in physiological assays. G188 knockout mutant seed germinated slightly better than wild-type controls under several kinds of osmotic stress. G188 knockout plants also showed higher susceptibility to the necrotrophic fungal pathogen Fusarium oxysporum compared to control plants, with more disease spread after infection. No significant morphological changes were observed in G188 knockout plants.

Utilities

G188 or its equivalogs can be used to enhance seed germination under adverse osmotic conditions. G188 or equivalogs may also be used to manipulate a plant's response to Fusarium oxysporum, and perhaps other pathogens.

G192 (SEQ ID NO: 101)

Published Information

G192 corresponds to gene A_IG002N01.6, first identified in the sequence of BAC clone A_IG002N01 (released by the Arabidopsis Genome Initiative; GenBank accession number AF007269).

Experimental Observations

The annotation of G192 in BAC AF007269 was experimentally confirmed. G192 was expressed in all plant tissues and under all conditions examined. Its expression was induced upon infection by Fusarium.

The function of G192 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G192 overexpressors were late flowering under 12 hour light and had more leaves than control plants. This phenotype was manifested in the three T2 lines analyzed. In addition, one line showed a decrease in seed oil content. No other differences between G192 overexpressing lines and control plants were noted in the assays performed.

A decrease in seed oil observed previously in one transgenic line was replicated in an independent experiment.

Utilities

G192 overexpression delayed flowering. A wide variety of applications exist for genes or their equivalogs that either lengthen or shorten the time to flowering, or for systems of inducible flowering time control. In particular, in species where the vegetative parts of the plants constitute the crop and the reproductive tissues are discarded, it would be advantageous to delay or prevent flowering. Extending vegetative development may bring about large increases in yields.

G192 or its equivalogs can be used to manipulate seed oil content, which might be of nutritional value.

G196 (SEQ ID NO: 105)

Published Information

G196 corresponds to gene At2g34830 (AAC12823).

Experimental Observations

The function of G196 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. 35S::G196 plants showed more tolerance to salt stress in a germination assay. Overexpression of G196 also produced a range of effects on plant morphology including a reduction in overall size, lowered fertility and changes in leaf shape. T1 seedlings were typically small, often had abnormally shaped cotyledons, and the rosette leaves produced by these plants were often undersized, contorted and darker green compared with wild type. Later in development, during the reproductive stage, the plants formed thin inflorescences bearing poorly fertile flowers with underdeveloped organs. 35S::G196 primary transformants were obtained at a relatively low frequency, suggesting that the gene might have lethal effects if overexpressed at very high levels.

35S::G196 plants were wild-type in the biochemical analyses that were performed. G196 was ubiquitously expressed (at different levels among the various tissues).

Utilities

G196 or its equivalogs may be used to improve plant performance under conditions of salt stress. Evaporation from the soil surface causes upward water movement and salt accumulation in the upper soil layer where the seeds are placed. Thus, germination normally takes place at a salt concentration that is higher than the mean salt concentration in the whole soil profile. Increased salt tolerance during the germination stage of a crop plant may impact survivability and yield.

G211 (SEQ ID NO: 121)

Published Information

G211 corresponds to Atmyb5 (U26935; Li et al. (1996) FEBS Lett. 379:117-121). Arabidopsis plants transgenic for a chimeric Atmyb5 promoter/GUS gene expressed the enzyme in developing leaf trichomes, stipules, epidermal cells on the margins of young rosette and cauline leaves, and in immature seeds. In immature seeds, Atmyb5 expression occurs between fertilization and the 16 cell stage of embryo development and persists beyond the heart stage.

Experimental Observations

The function of G211 was investigated using a homozygous mutant line in which a T-DNA was inserted into the coding region of the gene, as well as using transgenic lines in which G211 was expressed under the control of the 35S promoter. The phenotype of the G211 knockout mutant plants was wild-type in all respects. Overexpression of G211, however, had marked effects on leaf and inflorescence development. 35S::G211 plants were generally small, slow developing, and produced rounded, slightly serrated leaves, with very short petioles. Additionally these plants were dark green in coloration, and in some cases, appeared to have reduced trichome density. Following the switch to reproductive growth, 35S::G211 inflorescences had short internodes and showed a general reduction in apical dominance, leading to a bushy appearance. In many cases, due to the small size, seed yield was reduced compared with wild-type controls. These effects were highly penetrant and were apparent in the majority of T1 lines and, to some extent, in each of the three T2 populations. An increase in leaf xylose in two lines was also observed in the T2 35S::G211 transgenics.

As determined by RT-PCR, expression of G211 was found primarily in embryos and siliques. G211 expression in leaf tissue was unaffected by any environmental stress-related condition tested.

Utilities

G211 overexpression resulted in plants with altered leaf insoluble sugar content. Transcription factors such as G211 or their equivalogs that alter plant cell wall composition have several potential applications, including altering food digestibility, plant tensile strength, wood quality, and pathogen resistance, and use in pulp production.

In particular, hemicellulose is not desirable in paper pulps because of its lack of strength compared with cellulose. Thus, modulating the amounts of cellulose vs. hemicellulose in the plant cell wall is desirable for the paper/lumber industry. Increasing the insoluble carbohydrate content in various fruits, vegetables, and other edible consumer products will result in enhanced fiber content. Increased fiber content would not only provide health benefits in food products, but might also increase digestibility of forage crops. In addition, the hemicellulose and pectin content of fruits and berries affects the quality of jam and catsup made from them. Changes in hemicellulose and pectin content could result in a superior consumer product.

G214 (SEQ ID NO: 127)

Published Information

G214 (CCA1) was published by Wang et al. (1997) Plant Cell 9:491-507. CCA1 is involved in phytochrome induction of CAB genes. The transcript is transiently induced by phytochrome and oscillates with a circadian rhythm. It feedback-regulates its own expression at the transcriptional level. Overexpressing CCA1 abolished the circadian rhythm of several genes and resulted in plants that were late flowering and had elongated hypocotyls.

Experimental Observations

G214 overexpressing lines were late bolting, showed larger biomass (increased leaf number and size), and were darker green in vegetative and reproductive tissues due to a higher chlorophyll content in the later stages of development. In these later stages, the overexpressors also had higher insoluble sugar, leaf fatty acid, and carotenoid content per unit area. One line also showed a significant, repeatable increase in lutein levels in seeds. Microarray data were consistent with the morphological and biochemical data in that the genes that were highly induced included chloroplast localized enzymes, and light regulated genes such as Rubisco, carbonic anhydrase, and the photosystem I reaction center subunit precursor. A chlorophyll biosynthetic enzyme was also highly induced, consistent with the dark green color of the adult leaves and perhaps a higher photosynthetic rate. A measurement of leaf fatty acid in the older overexpressors suggested that the overall levels were higher than wild-type levels (except for the percent composition of 16:3 in one line). Percent composition of 16:1 and 16:3 fatty acids (found primarily in plastids) was similar to wild type, arguing against an increase in chloroplast number as an explanation for increased chlorophyll content in the leaves. Three G214-overexpressing lines were sensitive to germination on high glucose, showing less cotyledon expansion and hypocotyl elongation, suggesting the late bolting and dark green phenotype could be tied into carbon sensing, which has been shown to regulate phytochrome A signaling (Dijkwel et al. (1997) Plant Cell 9:583-595; Van Oosten et al. (1997) Plant J. 12:1011-1020). Sugars are key regulatory molecules that affect diverse processes in higher plants including germination, growth, flowering, senescence, sugar metabolism and photosynthesis. Glucose-specific hexose-sensing has also been described in plants and implicated in cell division and the repression of famine genes (photosynthetic or glyoxylate cycles).

Utilities

Potential utilities of this gene or its equivalogs include increasing chlorophyll content allowing more growth and productivity in conditions of low light. With a potentially higher photosynthetic rate, fruits can have higher sugar content. Increased carotenoid content may be used as a nutraceutical to produce foods with greater antioxidant capability. G214 or its equivalogs can also be used to manipulate seed composition, which is very important for the nutritional value and production of various food products.

G214 overexpression delayed flowering time in transgenic plants, and thus this gene or its equivalogs would be useful in modifying flowering time. In a sizeable number of species, for example, root crops, where the vegetative parts of the plants constitute the crop and the reproductive tissues are discarded, it is advantageous to identify and incorporate transcription factor genes that delay or prevent flowering in order to prevent resources being diverted into reproductive development. Extending vegetative development can thus bring about large increases in yields.

G225 (SEQ ID NO: 139)

Published Information

G225 is equivalent to the Arabidopsis gene CPC, or CAPRICE (Wada et al. (1997) Science 277:1113-1116; U.S. Pat. No. 5,831,060). G225 or CAPRICE is involved in epidermal cell differentiation. Mutations in the gene result in plants with very few root hairs, and the overexpression of the gene causes an increase in the number of root hairs and a near trichome-less leaf phenotype (Wada et al. (1997) supra).

Experimental Observations

The function of G225 was analyzed through its ectopic overexpression in plants. G225 overexpressors showed more root growth and were larger than wild-type controls on nitrogen-limiting media. In addition, the seedlings lacked anthocyanin production in response to several stress treatments. G225 overexpressors were glabrous and produced ectopic root hairs. The overexpressors also had more root hairs than wild-type controls on MS media (without treatment), but under conditions of low nitrogen these overexpressors produced even more root hairs. In addition, G225 overexpressors germinated better at 32° C. under heat stress. It is possible that better germination in the heat could be related to tolerance to water deficiency or drought tolerance. The nitrogen and heat tolerant phenotypes have not been reported for this gene.

Consistent with the tolerance to low nitrogen, this line showed an ammonium transporter induced 3.3-fold as well as a nitrate transporter (CHL1) induced 2.2-fold over wild-type. It is possible that the greater number of root hairs could account for the increase in transcript levels of these two genes and could also account for the phenotype observed. The nitrate transporter is localized to root hairs (Huang et al. (1999) Plant Cell 11:1381-1392).

Utilities

G225 may be used to produce plants that are more tolerant to conditions of low nitrogen and heat.

G226 (SEQ ID NO: 141)

Published Information

G226 was identified from the Arabidopsis BAC sequence, AC002338, based on its sequence similarity within the conserved domain to other Myb family members in Arabidopsis. To date, there is no published information regarding the function of this gene.

Experimental Observations

The function of G226 was analyzed through its ectopic overexpression in plants. G226 overexpressors were more tolerant to low nitrogen and high salt stress. They showed more root growth and possibly more root hairs under conditions of nitrogen limitation compared with wild-type controls. Many plants were glabrous and lacked anthocyanin production when under stress such as growth conditions of low nitrogen and high salt. Several G226 overexpressors were glabrous and produced less anthocyanin under stress; these effects might be due to binding site competition with other Myb family transcription factors involved in these functions and not directly related to the primary function of this gene.

Results from the biochemical analysis of G226 overexpressors suggested that one line had higher amounts of seed protein, which could have been a result of increased nitrogen uptake by these plants.

A microarray experiment was done on a separate G226 overexpressing line. The G226 sequence itself was overexpressed 16-fold above wild type; however, very few changes in other gene expression were observed in this line. On the array, a chlorate/nitrate transporter DNA sequence was induced 2.7-fold over wild type, which could explain the low nitrogen tolerant phenotype of the plants and the increased amounts of seed protein in one of the lines. The same DNA sequence was present several times on the array and in all cases the DNA sequence showed induction, adding more validity to the data. Five other genes/DNA sequences were induced but had unknown function. Sequences encoding a methyltransferase, a pollen-specific protein, and a zinc binding peroxisomal membrane protein were also induced; however, their role in regard to the phenotype of the plants is not known.
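The reasoning above, that induction seen at every replicate spot of the same sequence lends weight to the result, can be illustrated with a minimal sketch. The spot intensities, the function names, and the 2-fold cutoff below are hypothetical and chosen only for illustration; they are not data or thresholds from the experiments described here.

```python
def fold_change(transgenic_signal, wild_type_signal):
    """Ratio of overexpressor signal to wild-type signal for one array spot."""
    if wild_type_signal <= 0:
        raise ValueError("wild-type signal must be positive")
    return transgenic_signal / wild_type_signal


def consistently_induced(spot_pairs, threshold=2.0):
    """Return (all_induced, folds) for replicate spots of one sequence.

    spot_pairs: list of (transgenic_signal, wild_type_signal) tuples.
    threshold:  hypothetical minimum fold change counted as induction.
    """
    folds = [fold_change(t, w) for t, w in spot_pairs]
    return all(f >= threshold for f in folds), folds


# Hypothetical intensities for three replicate spots of one sequence.
replicate_spots = [(540.0, 200.0), (810.0, 300.0), (405.0, 150.0)]
induced, folds = consistently_induced(replicate_spots)
print(induced, [round(f, 2) for f in folds])
```

A single replicate falling below the cutoff makes the first returned value false, which is the sense in which agreement across all copies of the sequence supports the induction call.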

Utilities

The utilities of a gene or its equivalogs conferring tolerance to conditions of low nitrogen include: (1) cost savings to the farmer by reducing the amounts of fertilizer needed; (2) environmental benefits of reduced fertilizer runoff; (3) improved yield and stress tolerance. In addition, G226 can be used to increase seed protein amounts and/or composition, which may impact yield as well as the nutritional value and production of various food products.

G226 or its equivalogs can be used to alter trichome number and distribution in plants. Trichome glands on the surface of many higher plants produce and secrete exudates, which give protection from the elements and pests such as insects, microbes and herbivores. These exudates may physically immobilize insects and spores, may be insecticidal or antimicrobial, or they may be allergens or irritants to protect against herbivores. It has also been suggested that trichomes may decrease transpiration by decreasing leaf surface airflow, and by exuding chemicals that protect the leaf from the sun.

G241 (SEQ ID NO: 163)

Published Information

G241 is equivalent to Y19 (X90384), a putative light regulated Myb that was identified by Quaedvlieg et al. (1996) Plant Mol. Biol. 32:987-993. The Myb Consortium renamed this gene MYB15 and found that it was constitutively expressed at a low level with expression higher in etiolated seedlings (Kranz et al. (1998) Plant J. 16:263-276).

Experimental Observations

The function of G241 was analyzed through its ectopic overexpression in plants as well as through the analysis of a line homozygous for a knockout mutation in G241. The knockout mutant plants were wild-type in all assays performed. G241 overexpressors had a glucose germination phenotype, suggesting that G241 could be involved in glucose-specific sugar sensing.

Results from the biochemical analysis of G241 knockouts showed a lower amount of seed oil and an increase in seed protein.

RT-PCR analysis of the endogenous levels of G241 showed the gene is expressed in all tissue types tested.

Results from an array experiment using a G241 overexpressor line were consistent with expression in seeds. Several gene sequences were induced that could be involved in osmotic stress tolerance or desiccation tolerance, which are important for germinating seeds. In this experiment, the G241 DNA sequence itself was induced 38-fold. Many of the induced genes were transcription factors with unknown function. Both CBF1 and CBF2 (involved in freezing tolerance) were up-regulated. As mentioned above, several genes indicative of osmotic stress tolerance were also up-regulated. These same gene sequences were up-regulated on arrays of plants treated with mannitol as an osmotic stress, in a CBF2 overexpressor, and in cold-acclimated plants. A glucose transporter sequence was also up-regulated; however, this gene sequence was not up-regulated in any of the other arrays mentioned above. The phenotype of the overexpressor was reduced seedling growth on high glucose. It is possible that the plants were taking up more glucose. In such a scenario, the gene is not likely to be involved in sugar sensing; rather, the high glucose condition inhibits growth of the plants. The G241 overexpressors were tested for osmotic stress tolerance using mannitol. It is possible the glucose transporter is increasing mannitol uptake and increasing its toxicity to the plant as well. Polyethylene glycol (PEG) is an alternative osmoticum that can be tested at various concentrations.

Utilities

One potential utility of this gene or its equivalogs can be to engineer plants that are tolerant to stress. This can greatly impact yield. Alternatively, if this gene is involved in sugar sensing, the potential utility of a gene involved in glucose-specific sugar sensing is to alter energy balance, photosynthetic rate, biomass production, and senescence. Sugars are key regulatory molecules that affect diverse processes in higher plants including germination, growth, stress responses, flowering, senescence, sugar metabolism and photosynthesis. Glucose-specific hexose-sensing has been described in plants and implicated in cell division, and repression of famine genes (photosynthetic or glyoxylate cycles). This gene may also be used to alter oil and protein production in seeds, which may be very important for the nutritional quality and caloric content of foods.

G248 (SEQ ID NO: 171)

Published Information

G248 was identified at Mendel Biotechnology. Kranz et al. ((1998) Plant J. 16:263-276) published a cDNA sequence corresponding to G248, naming it MYB22.

Experimental Observations

The function of G248 was analyzed using transgenic plants in which the gene was expressed under the control of the 35S promoter. The phenotype of these transgenic plants was wild-type with respect to their morphology. However, overexpression of G248 in Arabidopsis was found to confer greater sensitivity to disease, particularly following infection by Botrytis cinerea. All three lines showed the susceptible phenotype.

As determined by RT-PCR, G248 appears to be expressed at low levels in embryo and silique tissue. No expression was detected in other tissues. G248 appears to be induced in response to salicylic acid (SA) treatment. It is well known that both synergistic and antagonistic crosstalk between growth regulator controlled defense pathways occurs in response to disease.

Utilities

Since G248 transgenic plants had an altered response to the fungal pathogen Botrytis cinerea, G248 or its equivalogs can be used to manipulate the defense response in order to generate pathogen-resistant plants.

G254 (SEQ ID NO: 179)

Published Information

G254 was identified from the Arabidopsis BAC sequence, AF007269, based on its sequence similarity within the conserved Myb domain to other Myb family members in Arabidopsis.

Experimental Observations

The function of G254 was analyzed through the ectopic overexpression of the gene in plants. Overexpression of G254 resulted in a reduction of germination and reduced seedling growth on glucose-containing media. G254 may be involved in sugar sensing.

RT-PCR analysis of the endogenous levels of G254 indicated that this gene was expressed in all tissues tested. A cDNA microarray experiment supported the tissue distribution data obtained by RT-PCR. There was no induction of G254 above its basal level in response to environmental stress treatments. G254 was constitutively expressed.

Utilities

The potential utility of G254 or its equivalogs is to alter source-sink relationships in the plant. Sugars are key regulatory molecules that affect diverse processes in higher plants including germination, growth, flowering, senescence, sugar metabolism, and photosynthesis. Sucrose is the major transport form of photosynthate and its flux through cells has been shown to affect gene expression and alter storage compound accumulation in seeds (source-sink relationships). The potential utilities of a gene involved in glucose-specific sugar sensing are to alter energy balance, photosynthetic rate, carbohydrate accumulation, biomass production, source-sink relationships, and senescence. Glucose-specific hexose-sensing has been described in plants and implicated in cell division and the repression of 'famine' genes (photosynthetic or glyoxylate cycles).

G256 (SEQ ID NO: 183)

Published Information

G256 is equivalent to Y13, a gene that was identified by Quaedvlieg et al. ((1996) Plant Mol. Biol. 32:987-993) as being induced in etiolated seedlings one hour after being exposed to light. The Myb consortium has renamed this gene MYB31. Quaedvlieg et al. (1996, supra) found a low level of expression in stem and silique tissue with no induction in etiolated seedlings after being exposed to light. However, there was also a slight induction of G256 following cold treatment.

Experimental Observations

The function of G256 was analyzed through its ectopic overexpression in plants. G256 overexpressors had enhanced seedling vigor during cold germination. These overexpressing lines were more tolerant to chilling conditions compared to wild-type controls, as seen in 12-day-old seedlings that were transferred to cold temperatures (8° C.).

There was no difference in germination rate under normal growth conditions. The chilling-tolerant phenotype was most noticeable with respect to enhanced root growth, although the cotyledons also showed less anthocyanin production than wild-type controls.

Plants overexpressing G256 were also small and early bolting. In the T2 generation, one line lacked the waxy surface on the bolts. Three lines were tolerant to cold germination and therefore co-suppression was not a likely cause of the morphological change observed in one line. An array experiment was performed on this G256 overexpressing line. The gene itself was induced 3.5-fold over wild-type levels. Very few additional gene sequences were significantly induced in response to G256 overexpression. Induced genes included four gene sequences of unknown function, a sugar carrier sequence, a cell wall degrading enzyme (BGL2) sequence, a pectinesterase sequence, and a proteasome subunit protein sequence. Gene sequences such as an allene oxidase sequence (whose repression could indicate downregulation of the associated jasmonate synthesis pathway) and an endochitinase sequence were repressed.

RT-PCR analysis of the endogenous levels of G256 indicated that this gene sequence was expressed primarily in shoots, flowers, and siliques. A cDNA microarray experiment confirmed the tissue distribution data obtained by RT-PCR. There was no induction of G256 in leaves or in seedlings in response to environmental stress treatments.

Utilities

The potential utility of this gene or its equivalogs is to confer better germination and growth in the cold. The germination of many crops is very sensitive to cold temperatures. A gene that would allow germination and seedling vigor in the cold would have tremendous utility in allowing seeds to be planted earlier in the season with a high rate of survivability.

G291 (SEQ ID NO: 209)

Published Information

G291 is referred to in the public literature as Arabidopsis AJH1, a plant homolog of the c-Jun coactivator. AJH1 was isolated by peptide sequencing of a subunit of the COP9 complex, an important component in light-mediated signal transduction in Arabidopsis. It is postulated that the COP9 complex may modulate the activities of transcription factors in response to environmental stimuli. Localization experiments revealed that AJH1 was present in monomeric form, which suggested a possible involvement in other developmentally regulated processes (Kwok et al. (1998) Plant Cell 10:1779-1790). G291 is found in the sequence of the chromosome 1 BAC F19G10 (GenBank accession AF000657.1 GI:2098816), released by the Arabidopsis Genome Initiative. The start and stop codons were correctly predicted.

Experimental Observations

The expression profile of G291 revealed a low, but constitutive, expression of G291 transcripts in all tissues examined. G291 transcript levels were similar to the wild-type controls in all the physiological treatments examined as determined by RT-PCR analysis.

G291 overexpressors produced significantly more seed oil than wild-type plants.

Utilities

G291 or its equivalogs can be used to increase seed oil content, which may be of nutritional value in food for human consumption as well as in animal feeds.

G325 (SEQ ID NO: 223)

Published Information

G325 was identified as a gene in the sequence of chromosome 4, ESSA I FCA contig fragment No. 3 (GenBank Accession number Z97338), released by the European Union Arabidopsis Sequencing Project.

Experimental Observations

The function of G325 was analyzed using transgenic plants in which G325 was expressed under the control of the 35S promoter. G325 overexpressing plants had more tolerance to osmotic stress in a germination assay in three separate experiments. They had more seedling vigor than wild-type controls when germinated on plates containing high salt and high sucrose. No altered morphological phenotypes or altered phenotypes in the biochemical assays were observed.

G325 was expressed at high levels in flowers and cauline leaves, and at lower levels in shoots, rosette leaves, and seedlings. G325 was induced by auxin and by cold and heat stress. The expression of G325 was also reduced in response to Fusarium infection or salicylic acid treatment.

Utilities

G325 or its equivalogs may be useful for enhancing seed germination under high salt conditions or other conditions of osmotic stress. Evaporation from the soil surface causes upward water movement and salt accumulation in the upper soil layer where the seeds are placed. Thus, germination normally takes place at a salt concentration much higher than the mean salt concentration in the whole soil profile. Increased salt tolerance during the germination stage of a crop plant would impact survivability and yield.

G325 or its equivalogs can also be used to engineer plants with enhanced tolerance to drought, salt stress, and freezing, at later stages.

G361 (SEQ ID NO: 237)

Published Information

G361 was first isolated by Tague et al. ((1995) Plant Mol. Biol. 28:267-279) in an effort to study the sequence and the expression pattern of C₂H₂ zinc finger protein encoding genes in Arabidopsis (Takatsuji (1998) Cell. Mol. Life Sci. 54:582-596). The latter study showed that G361 (ZFP6) was mostly expressed in roots and shoots based on Northern analysis.

Experimental Observations

A full-length cDNA was isolated and used to transform plants. G361 overexpressors were small and very late bolting. The plants did not show any physiological phenotype. G361 overexpressing plants had increased levels of polyunsaturated fatty acids. The phenotype could be related to the darker green color of the plants and their possible higher chlorophyll content (repeat of analysis also in progress). Higher 16:3 fatty acid content, in particular, could be a reflection of a higher chloroplast number or more chloroplast membranes. RT-PCR data showed that the gene was expressed mostly in shoots and in roots at low levels.

Utilities

The late-flowering phenotype conferred by G361 or its equivalogs is useful in that late flowering is desirable in crops where the vegetative portion of the plant is harvested (often vegetative growth stops when plants make the transition to flowering). In this case, it can be advantageous to prevent or delay flowering in order to increase yield. Also, prevention of flowering can be useful in these same crops in order to prevent the spread of transgenic pollen and/or to prevent seed set. In any case, the overexpressors were clearly smaller, an undesirable phenotype which has to be corrected before overexpression of the gene can lead to any useful crop product.

G390 (SEQ ID NO: 249)

Published Information

G390 was isolated by Ruzza et al. (GenBank Accession: CAD29544, gi:20069421) using degenerate oligonucleotides corresponding to a conserved 6 amino acid sequence from the helix-3 region of athb-1 and athb-2. It was named athb-9. The published Northern blot showed a slightly higher level of expression in stems, and lower levels in leaves, flowers, roots, and siliques. The G390 protein shares very extensive amino acid identity with other HD-ZIP class III proteins that exist in Arabidopsis (for example, G391 and G438). HD-ZIP class III proteins are known to have complex roles in determining meristem development, vascular tissue formation, and stem lignification (Baima et al. (1995) Development 121:4171-4182; Baima et al. (2001) Plant Physiol. 126:643-655; Talbert et al. (1995) Development 121:2723-2735; Zhong et al. (1997) Plant Cell 9:2159-2170; Sessa et al. (1998) Plant Mol. Biol. 38:609-622; Zhong et al. (1999) Plant Cell 11:2139-2152; Ratcliffe et al. (2000) Plant Cell 12:315-317; and Otsuga et al. (2001) Plant J. 25:223-236).

Experimental Observations

Fourteen 35S::G390 T1 lines were obtained which displayed a consistent morphological phenotype; the majority of these plants were slightly small, had abnormal phyllotaxy, and exhibited stem bifurcations in which shoot meristems split to form two or three separate shoots. Additionally, a significant number of these extra T1 lines flowered earlier than controls. Comparable effects were obtained by overexpression of G391.

Utilities

The overexpression data suggest that G390 or its equivalogs have utility in the manipulation of shoot architecture. Additionally, since a number of the 35S::G390 lines flowered early, this gene or its equivalogs can be used to manipulate flowering time.

G409 (SEQ ID NO: 261)

Published Information

G409, also named Athb-1, was one of the earliest plant homeodomain leucine zipper (HD-ZIP) genes cloned. It was isolated from a cDNA library by highly degenerate oligonucleotides corresponding to a conserved eight amino acid sequence from the helix-3 region of the homeodomain. The protein was found to transactivate a promoter linked to a specific DNA binding site (CAATTATTG) in transient expression assays. Overexpression of Athb-1 affected the development of palisade parenchyma under normal growth conditions, resulting in light green sectors in leaves and cotyledons, whereas other organs in the transgenic plants remained normal.
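The structure of the binding site quoted above can be inspected with a short script. The following is a minimal illustrative sketch, not part of the described experiments: the function name reverse_complement and the split of the nine-base site into two four-base half-sites around a central base are assumptions made here for illustration only.

```python
# Illustrative sketch only: inspect the Athb-1 binding site quoted in the text.
# The helper name and the 4 + 1 + 4 split are assumptions, not from the source.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

site = "CAATTATTG"  # binding site as given in the text

# Split into a left half-site, a central base, and a right half-site.
left, center, right = site[:4], site[4], site[5:]

print("left half-site: ", left)
print("central base:   ", center)
print("right half-site:", right)

# True when the two half-sites are reverse complements of each other.
print("half-sites are reverse complements:", reverse_complement(left) == right)

# The full site differs from its own reverse complement only at the center.
print("reverse complement of full site:", reverse_complement(site))
```

Run as written, the script reports whether the two flanking half-sites mirror one another, which is one simple way to examine the symmetry of a candidate recognition sequence before searching promoter regions for it.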

Experimental Observations

G409 was induced by drought and repressed by NaCl. Plants overexpressing G409 were more tolerant to infection by the fungal pathogen Erysiphe orontii. In addition to the Erysiphe-tolerant phenotype, the overexpressors were slightly early flowering.

Utilities

The expression of transcription factors such as G409 or its equivalogs involved in plant/pathogen interaction can be modulated to manipulate the plant defense, wound, or insect response in order to generate pathogen-resistant plants.

G438 (SEQ ID NO: 283)

Published Information

G438 was identified as a homeobox gene (MUP24.4) within P1 clone MUP24 (GenBank accession number AB005246). G438 was identified as the Arabidopsis REVOLUTA (REV) gene (Ratcliffe et al. (2000) Plant Cell 12:315-317). Based on its mutant phenotype, REV had previously been identified as having a key role in regulating the relative growth of apical versus non-apical (cambial) meristems (Alvarez (1994) in Arabidopsis: An Atlas of Morphology and Development (ed. J. Bowman), pp. 188-189, New York, N.Y.: Springer-Verlag; Talbert et al. (1995) Development 121:2723-2735). The revoluta phenotype was highly pleiotropic but was characterized by a failure in development of all types of apical meristem: lateral shoot meristems in the axils of cauline and rosette leaves were often completely absent, or replaced by a solitary leaf. These effects were most evident in higher order shoots, but in some cases, the primary shoot meristem also failed and terminated growth in a cluster of filamentous structures. Rev floral meristems often failed to complete normal development and formed incomplete or abortive filamentous structures. In contrast to apical meristems, structures formed by non-apical meristems, such as leaves, stems, and floral organs, often became abnormally large and contorted in the rev mutant.

The features of rev mutants were similar to those of the interfascicular fiberless1 (ifl1) mutant. Ifl1 was isolated during screens for mutants lacking normal stem fiber differentiation (Zhong et al. (1997) Plant Cell 9:2159-2170). Wild-type Arabidopsis plants form interfascicular fibers which become lignified and add support to the inflorescence stem (Aloni (1987) Annu. Rev. Plant Physiol. 38:179-204; Zhong et al. (1997) supra; Zhong et al. (1999) Plant Cell 11:2139-2152). In the ifl1 mutant, normal interfascicular fibers were absent and the differentiation of both xylary fibers and vessel elements was disrupted. In addition to these internal features, ifl1 mutants had secondary morphological features very similar to those of rev. Recently the IFL1 gene was cloned by Zhong et al. (1999, supra). It was found that the IFL1 sequence and map position were identical to those of the cloned REV gene, demonstrating that REV and IFL1 are the same gene (Ratcliffe et al. (2000) supra).

It had been suggested that REV promotes the growth of apical meristems (including floral meristems) at the expense of non-apical meristems (Talbert et al. (1995) supra). It is not yet clear, however, whether expression data support such a role: strong expression of REV has been detected in interfascicular regions and developing vascular tissue, but in-situ expression analysis of apical meristems has not yet been reported (Zhong et al. (1999) supra). REV is a group III HD-ZIP protein and shares high sequence similarity (and organization) with the proteins encoded by three other Arabidopsis genes: Athb8, Athb9, and Athb14 (Sessa et al. (1998) Plant Mol. Biol. 38:609-622). It is possible, therefore, that these genes act together in the same developmental process. Supporting this suggestion, Athb8 had a similar expression pattern to REV and was transcribed in the procambial regions of vascular bundles (Baima et al. (1995) Development 121:4171-4182).

Experimental Observations (Knockout)

G438 was initially identified as MUP24.4, a novel putative homeobox gene within P1 clone MUP24 (GenBank Accession AB005246). Annotation was confirmed by isolation of the G438 cDNA: the cDNA had an in-frame stop codon immediately 5′ to the predicted start codon and comprised 18 exons that had been predicted within the genomic sequence.

Plants homozygous for a T-DNA insertion in the G438 sequence were obtained by PCR-based screening of DNA pools from the Jack Collection of insertional mutants (Campisi et al. (1999) Plant Journal 17:699-707). The T-DNA insertion was located 466 bp downstream of the putative start codon, and was predicted to create a null mutation. The mutation was recessive and produced a revoluta phenotype. Complementation crosses and sequencing of a known revoluta allele demonstrated that G438 was REVOLUTA.

RT-PCR analyses detected G438 expression at medium to high levels in all tissues and conditions tested. Further expression analysis was possible since the T-DNA insertion contained an enhancer trap construct (Campisi et al. (1999) supra). GUS staining could therefore be used to reveal the expression pattern of genes within which insertions occurred. GUS staining of seedlings homozygous and heterozygous for the G438 T-DNA insertion revealed very strong expression within axillary shoots. These expression data correlate with the marked effects of the rev mutation on outgrowth of higher order shoots.

Experimental Observations (Overexpressor)

A full-length clone was amplified from cDNA derived from mixed tissue samples, and 35S::G438 transformants were generated. These lines appeared wild-type in the physiological assays, but showed differences in morphology compared with control plants. At early stages, a small number of T1 plants displayed aberrant phyllotaxy and were rather dwarfed, but these effects were inconsistent, and the majority of lines appeared wild-type. At later stages, however, around half of the primary transformants, from two of the three T1 sowings, developed slightly larger, flatter leaves than wild type. The progeny of four lines that had shown these phenotypes were examined in the T2 generation. At late stages, plants from two of these T2 populations again displayed slightly broad, flat leaves, but plants from the other two T2 populations appeared wild-type at all stages.

A single T1 plant line out of a total of 37 lines had highly aberrant shoot meristem development. At the early seedling stage, it appeared as though the primary shoot apex of this individual had developed into a terminal leaf-like structure. Subsequent growth then continued from an axillary shoot meristem that initiated from the base of a cotyledon petiole. However, this effect became silenced between generations and was not observed in the T2 progeny from one line. Given that this effect was observed in only a single line, it could have been the result of an activation tagged locus at the T-DNA insertion site, rather than due to G438 expression. However, the phenotype would fit with a role for REV in regulating apical meristem development.

Utilities

The mutant phenotypes indicated that REV/IFL1 or its equivalogs have an important role in determining overall plant architecture and the distribution of lignified fiber cells within the stem. A number of utilities can be envisaged based upon these functions.

Modifying the activity of REVOLUTA orthologs from tree species can offer the potential for modulating lignin content. This can allow the quality of wood used for furniture or construction to be improved. Lignin is energy rich; increasing lignin composition could therefore be valuable in raising the energy content of wood used for fuel. Conversely, the pulp and paper industries seek wood with a reduced lignin content. Currently, lignin must be removed in a costly process that involves the use of many polluting chemicals. Consequently, lignin is a serious barrier to efficient pulp and paper production (Tzfira et al. (1998) TIBTECH 16:439-446; Robinson (1999) Nature Biotechnology 17:27-30). In addition to forest biotechnology applications, changing lignin content might increase the palatability of various fruits and vegetables.

In Arabidopsis, reduced REV activity results in a reduction of higher-order shoot development. Reducing activity of REV orthologs may generate trees that lack side branches, and have fewer knots in the wood. Altering branching patterns can also have applications amongst ornamental and agricultural crops. For example, applications might exist in any species where secondary shoots currently have to be removed manually, or where changes in branching pattern could increase yield or facilitate more efficient harvesting.

G464 (SEQ ID NO: 291)

Published Information

G464 is IAA12, a member of the Aux/IAA class of small, short-lived nuclear proteins that contain four conserved domains. IAA12 was found as one of a group of Arabidopsis IAA genes that were isolated based on homology to early auxin-induced genes of pea. IAA12 transcripts were modestly (2 to 4-fold) induced by auxin, with optimal induction at 10 μM auxin (Abel et al. (1995) J. Mol. Biol. 251:533-549).

Experimental Observations

G464 overexpressing Arabidopsis lines showed enhanced germination in high heat conditions. In addition, one Arabidopsis line overexpressing G464 showed an increase in total seed protein and a decrease in total seed oil by NIR in one assay.

Utilities

G464 or its equivalogs in native or altered form is useful to produce plants that germinate better in hot conditions.

G470 (SEQ ID NO: 295)

Published Information

A partial cDNA clone corresponding to G470 was isolated in a two-hybrid screen for proteins that interact with ARF1, a transcription factor that binds to auxin response elements, and this clone was named ARF1 Binding Protein (Ulmasov et al. (1997) Science 276:1865-1868). A full-length clone was later isolated, and the gene was renamed ARF2 (Ulmasov et al. (1999a) Proc. Natl. Acad. Sci. 96:5844-5849). ARF2 was shown to bind to an auxin response element (Ulmasov et al. (1999b) Plant J. 19:309-319).

Co-transfection of ARF2 and a reporter construct with an auxin response element into carrot protoplasts did not result in either activation or repression of transcription of the reporter gene (Ulmasov et al. (1999a) supra). ARF2 binding to palindromic auxin response elements is thought to be facilitated by dimerization mediated by the carboxy-terminal domain of ARF2 (Ulmasov et al. (1999b) supra). It is possible that ARF2 regulates gene expression through heterodimerization with other ARF proteins or with IAA proteins. ARF2 was found to be expressed uniformly in roots, rosette leaves, cauline leaves, flowers, and siliques (Ulmasov et al. (1999b) supra).

Experimental Observations

Expression of a truncated G470 clone in the antisense orientation under the 35S promoter caused infertility in Arabidopsis. In primary transformants expressing the G470 clone, the stamens failed to elongate properly. Pollen was produced, but was not deposited on the stigma. The transformants appeared otherwise morphologically normal. Because of the infertility of the primary transformants, no material was available for biochemical and physiological analyses. The truncated clone corresponds to the carboxy-terminal portion of the ARF2 protein, and lacks the DNA binding domain.

Utilities

G470 or its equivalogs are useful in engineering infertility in self-pollinating plants.

G475 (SEQ ID NO: 301)

Published Information

G475 was identified from an Arabidopsis thaliana cDNA library using two cDNAs from Antirrhinum majus, SBP1 and SBP2, as probes (Cardon et al. (1997) Plant J. 12:367-377). The Arabidopsis cDNA, SPL3, which corresponds to G475, is the presumed ortholog of SBP1. In Antirrhinum majus, SBP1 was identified as a transcription factor that binds to the promoter of the floral meristem identity gene SQUAMOSA, and is therefore implicated in a plant's transition to flowering. The overexpression of SPL3 (G475) in Arabidopsis resulted in plants that were early bolting, with fewer leaves, and that often had an extra bract subtending the first flower (Cardon et al. (1997) supra). SPL3 antisense plants produced by these authors had no phenotypic differences from wild type.

Closely Related Genes from Other Species

The closest relative to G475 is its homolog in A. majus, SBP1.

Experimental Observations

The function of G475 was analyzed through its ectopic overexpression in plants. The morphological phenotype of the G475 overexpressors confirmed the published phenotype. An array experiment was performed on a G475 overexpressing line. The gene itself was overexpressed six-fold on this array. G475 overexpressors had a flowering phenotype: they were early bolting and had an abnormal inflorescence structure. There were two genes induced on the array that were consistent with this flowering phenotype. One was AGL8, a gene expressed in the inflorescence meristem, stem, cauline leaves and developing fruits (Mandel et al. (1995) Plant Cell 7:1763-1771), whose function appears to be in fruit development (Gu et al. (1998) Development 125:1509-1517). The other gene that was consistent with the abnormal inflorescence structure phenotype was MFP2 (fatty acid multi-functional protein), which is involved in beta-oxidation of fatty acids. Alterations in the MFP2 protein have been reported to cause alterations in inflorescence development in plants.

Utilities

The potential utility of plants with an early bolting phenotype includes: (1) early flowering could accelerate many conventional breeding programs, (2) early flowering without an impact on yield could shorten generation time and allow for multiple harvests per season, and (3) an inducible system that would allow rapid triggering of flowering would allow synchronization of flowering time, which is important in the horticulture industry as well as for any crop that is mechanically harvested.

G477 (SEQ ID NO: 303)

Published Information

G477 corresponds to SPL6 (AJ011643, Cardon et al. (1999) Gene 237:91-104), a member of the SBP family of transcription factors. G477 is expressed constitutively throughout the development of Arabidopsis. Outside the SBP-domain, G477 has a putative myc-like helix-loop-helix dimerization domain (Cardon et al. (1999) supra).

Experimental Observations

The complete sequence of G477 was determined. The function of this gene was analyzed using transgenic plants in which G477 was expressed under the control of the 35S promoter. The phenotype of these transgenic plants was wild-type in all morphological and biochemical assays performed.

Plants overexpressing G477 were slightly more sensitive to the herbicides glyphosate and acifluorfen and to oxidative stress caused by rose bengal compared with wild-type controls. Plants overexpressing G477 also developed more disease symptoms following inoculation with a moderate dose of Sclerotinia sclerotiorum compared with control plants. It is well known that oxidative stress is a component of a plant defense response to pathogens, and therefore the disease susceptibility phenotype could be related to a general sensitivity to oxidative stress.

G477 was expressed in all tissues and under all conditions tested in RT-PCR and cDNA microarray experiments.

Utilities

G477 activity was shown to affect the response of transgenic plants to the fungal pathogen Sclerotinia sclerotiorum and oxidative stress tolerance. Therefore, G477 or its equivalogs can be used to manipulate the defense response in order to generate pathogen-resistant plants.

G482 (SEQ ID NO: 305)

Published Information

G482 is equivalent to AtHAP3b, which was identified by Edwards et al. ((1998) Plant Physiol. 117:1015-1022) as an EST with homology to the yeast gene HAP3b. The northern blot data of Edwards et al. suggest that AtHAP3b is expressed primarily in roots. No other functional information regarding G482 is publicly available.

Experimental Observations

G482 function was analyzed through its ectopic overexpression in plants under the control of a 35S promoter. G482 overexpressors were more tolerant to high NaCl in a germination assay.

RT-PCR analysis of endogenous levels of G482 transcripts indicated that this gene was expressed constitutively in all tissues tested. A cDNA array experiment supported the RT-PCR derived tissue distribution data. G482 was not induced above basal levels in response to any environmental stress treatments tested.

Utilities

The utilities of this gene or its equivalogs include the ability to confer salt tolerance during the germination stage of a crop plant. This would most likely impact survivability and yield. Evaporation of water from the soil surface causes upward water movement and salt accumulation in the upper soil layer, where the seeds are placed. Thus, germination normally takes place at a salt concentration much higher than the mean salt concentration in the whole soil profile.

G489 (SEQ ID NO: 309)

Published Information

G489 was identified from a BAC sequence that showed high sequence homology to AtHAP5-like transcription factors in Arabidopsis. No published information is available regarding the function of this gene.

Experimental Observations

The function of G489 was analyzed through its ectopic overexpression in plants. G489 overexpressors were more tolerant to high NaCl stress, showing more root growth and leaf expansion compared with the controls in culture. Two well-characterized ways in which NaCl toxicity is manifested in the plant are general osmotic stress and potassium deficiency due to the inhibition of its transport. These G489 overexpressor lines were more tolerant to osmotic stress in general, showing more root growth on mannitol-containing media.

RT-PCR analysis of endogenous levels of G489 transcripts indicated that this gene was expressed constitutively in all tissues tested. A cDNA array experiment confirmed the RT-PCR derived tissue distribution data. G489 was not induced above basal levels in response to the stress treatments tested.

Utilities

The utilities of this gene or its equivalogs include the ability to confer salt tolerance during the growth and developmental stages of a crop plant. This would impact yield and/or biomass.

G509 (SEQ ID NO: 317)

Published Information

G509 was identified in the sequence of BAC F2009, GenBank accession number AL021749, released by the Arabidopsis Genome Initiative.

Experimental Observations

The function of G509 was analyzed using transgenic plants in which G509 was expressed under the control of the 35S promoter, as well as using a line homozygous for a T-DNA insertion in G509. The T-DNA insertion in G509, at nucleotide position +1583 with respect to the ATG start codon, was approximately halfway into the coding sequence of the gene and therefore was likely to result in a null mutation. G509 primary transformants showed no significant morphological differences from control plants, though one T2 line was noted to be small and sickly at the seedling and rosette stages, and pale and late flowering at the flowering stage. Knockout plants showed no consistent morphological differences from controls. G509 knockout plants may be more susceptible to infection with a moderate dose of the fungal pathogen Erysiphe orontii; 8 out of 8 plants tested showed more fungal growth compared with the wild-type controls. G509 lines had significantly higher levels of chlorophyll a, and lower levels of chlorophyll b in seeds.

G509 knockout mutants produced more seed oil and more seed protein than wild-type control plants.

Endogenous G509 was expressed constitutively in all tissues tested, with the highest levels of expression in shoots, roots, flowers and siliques.

Utilities

G509 or its equivalogs can be used to produce plants with altered seed oil and seed protein content.

G509 or its equivalogs can be used to manipulate the defense response in order to generate pathogen-resistant plants.

In addition, G509 or its equivalogs can be used to regulate the levels of chlorophyll in seeds.

G545 (SEQ ID NO: 345)

Published Information

G545 was discovered independently by two groups. Lippuner et al. ((1996) J. Biol. Chem. 271:12859-12866) identified G545 as an Arabidopsis cDNA (STZ), which increases the tolerance of yeast to Li⁺ and Na⁺. They found that STZ expression is most abundant in leaves and roots, and that its level of expression increases slightly upon exposure of the plant to salt. The second group (Meissner et al. (1997) Plant Mol. Biol. 33:615-624) identified G545 (ZAT10) in a group of Arabidopsis C₂H₂ zinc finger protein-encoding cDNAs that they isolated by degenerate PCR. According to their data, ZAT10 is expressed in roots, shoots and stems.

Closely Related Genes from Other Species

A closely related non-Arabidopsis sequence is a cDNA from the nitrogen-fixing species Datisca glomerata (AF119050). The similarity of this sequence with G545 extends beyond the conserved domain.

Experimental Observations

Plants overexpressing G545 flowered early, and in extreme cases were infertile. G545 overexpression conferred tolerance to phosphate deficiency on transgenic plants. This could be the result of insensitivity to phosphate, higher rates of phosphate assimilation or larger stores of phosphate. G545 overexpressors also appeared to be more sensitive to NaCl than wild-type plants. This result was unexpected, since yeast cells overexpressing G545 are more tolerant to salt stress than control cells. There may be a dominant negative effect in plants, triggered by the over-accumulation of the G545 protein, which does not exist in yeast.

G545 overexpressing plants appeared to be significantly more susceptible to pathogens than control plants. This implied a role for G545 in the control of defense mechanisms.

Utilities

G545 or equivalog overexpression may result in tolerance to phosphate deficiency. Young plants have a rapid intake of phosphorus, so it is important that seed beds have a high enough phosphate content to sustain their growth. Also, root crops such as carrot, potato and parsnip will all decrease in yield if there is insufficient phosphate available. Phosphate costs represent a relatively small but significant portion of farmers' operating costs (3-4% of total costs to a corn farmer in the US, higher to a vegetable grower). Plants that are tolerant to phosphate deficiency can represent a cost saving for farmers, especially in areas where soils are very poor in phosphate.

Another desirable phenotype, salt tolerance, may arise from G545 or equivalog silencing rather than overexpression. Additionally, G545 appeared to be induced by cold, drought, salt and osmotic stresses, which was in agreement with a potential role of the gene in protecting the plant in such adverse environmental conditions.

G545 also appears to be involved in the control of defense processes. However, overexpression of G545 made Arabidopsis plants more susceptible to disease. This negative effect will have to be corrected before G545 can be used in a crop to induce tolerance to low phosphate, such as by restricting overexpression of G545 or its equivalogs to roots.

G561 (SEQ ID NO: 359)

Published Information

G561 is the Arabidopsis gene GBF2 (Schindler et al. (1992) EMBO J. 11: 1261-1273), which was cloned by hybridization to GBF1. GBF2 is constitutively expressed in both light- and dark-grown leaves and is expressed in roots, and the nuclear import of GBF1 may be light regulated (Terzaghi et al. (1997) Plant J. 11: 967-982).

Closely Related Genes from Other Species

Close relatives of G561 include a G-box binding protein from Sinapis alba (Y16953; unpublished) and a G-box binding protein from Raphanus sativus (X92102; unpublished).

Experimental Observations

The function of G561 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. Plants overexpressing G561 showed more root growth on potassium-free media. G561 also appears to be constitutively expressed, and may be preferentially expressed in siliques and moderately inducible by heat stress.

An important aspect of the potassium root growth assay is that plants were first germinated on media with potassium and then transferred onto potassium-free media. G561 overexpressors may have been able to cope with less potassium, and it is also possible that G561 overexpressors accumulated more potassium before they were transferred, which allowed the roots to grow more vigorously after transfer.

As measured by NIR, G561 overexpressors were found to have increased seed oil content compared to wild-type plants.

Utilities

G561 or its equivalogs could be used to increase seedling vigor or plant growth in soils that are low in potassium. Potassium is a macronutrient required for a variety of basic plant functions which is commonly added to soil as a fertilizer. The ability to grow plants on low potassium soils may save the ecological and material cost of soil fertilization.

G561 or its equivalogs may also be used to manipulate sterol composition, and may be used to modify seed oil content in plants, which may be very important for the nutritional value and production of various food products.

G562 (SEQ ID NO: 361)

Published Information

G562 is the published Arabidopsis transcription factor GBF3, which was cloned through its hybridization with GBF1 (Schindler et al. (1992) EMBO J. 11: 1261-1273). GBF3, like GBF1 and GBF2, can bind G-box elements as a homodimer, or as a heterodimer with other bZIP family members. GBF3 appears to be highly expressed in roots in comparison to leaves, and repressed by light. GBF3 binds to G-box elements in the Arabidopsis ADH promoter in vitro, is induced by ABA in suspension cultures, and is proposed to be the transcription factor responsible for ABA-regulated ADH gene expression (Lu et al. (1996) Plant Cell 8: 847-857).

Closely Related Genes from Other Species

Similar genes to G562 include the B. napus proteins BnGBF1 and BnGBF2 (U27107 and U27108), which are strikingly similar to G562 over their entire lengths. An unpublished Catharanthus roseus G-box binding protein 1 (AF084971) also has significant homology to G562 outside of the conserved domain.

Experimental Observations

G562 appeared to be preferentially expressed in root and flower tissues by RT-PCR analysis, and expressed at lower levels in other tissues of the plant. G562 was induced by heat, drought and osmotic stress in seedlings. The function of G562 was analyzed using transgenic plants in which G562 was expressed under the control of the 35S promoter. Plants overexpressing G562 were consistently and significantly later flowering, with more crinkled leaves than wild-type plants.

Utilities

G562 or its equivalogs could be used to manipulate flowering time in plants.

G567 (SEQ ID NO: 369)

Published Information

G567 was discovered as a bZIP gene in BAC T10P11, accession number AC002330, released by the Arabidopsis Genome Initiative.

Closely Related Genes from Other Species

G567 is similar to two bZIP factors from Petroselinum crispum (1806261) and Glycine max (1905785). Similarity between these two proteins and the protein encoded by G567 extends beyond the conserved domains, and thus they may have a function and utility similar to those of G567.

Experimental Observations

The annotation of G567 in BAC AC002330 was experimentally confirmed and the function of G567 was analyzed using transgenic plants in which G567 was expressed under the control of the 35S promoter.

Seedlings overexpressing G567 had slowly opening cotyledons and very short roots when grown on MS plates containing glucose. G567 is thus likely to be involved in sugar sensing or metabolism during germination.

As measured by NIR analysis, plants overexpressing G567 had an increase in total combined seed oil and seed protein content.

G567 appears to be constitutively expressed, and induced in leaves in a variety of conditions.

Utilities

G567 or its equivalogs may be useful in manipulating seed oil and protein content.

G567 or its equivalogs may be used to modify sugar sensing.

In addition to their important role as an energy source and structural component of the plant cell, sugars are central regulatory molecules that control several aspects of plant physiology, metabolism and development. It is thought that this control is achieved by regulating gene expression and, in higher plants, sugars have been shown to repress or activate plant genes involved in many essential processes such as photosynthesis, glyoxylate metabolism, respiration, starch and sucrose synthesis and degradation, pathogen response, wounding response, cell cycle regulation, pigmentation, flowering and senescence.

Because sugars are important signaling molecules, the ability to control either the concentration of a signaling sugar or how the plant perceives or responds to a signaling sugar could be used to control plant development, physiology or metabolism. For example, the flux of sucrose (a disaccharide sugar used for systemically transporting carbon and energy in most plants) has been shown to affect gene expression and alter storage compound accumulation in seeds. Manipulation of the sucrose signaling pathway in seeds may therefore cause seeds to have more protein, oil or carbohydrate, depending on the type of manipulation. Similarly, in tubers, sucrose is converted to starch which is used as an energy store. It is thought that sugar signaling pathways may partially determine the levels of starch synthesized in the tubers. The manipulation of sugar signaling in tubers could lead to tubers with a higher starch content.

Thus, manipulating the sugar signal transduction pathway may lead to altered gene expression to produce plants with desirable traits. In particular, manipulation of sugar signal transduction pathways could be used to alter source-sink relationships in seeds, tubers, roots and other storage organs, leading to an increase in yield.

G584 (SEQ ID NO: 385)

Published Information

G584 was identified in chromosome IV BAC T6K21 sequence (gene T6K21.10) by the EU Arabidopsis sequencing project as “bHLH protein-like”.

Closely Related Genes from Other Species

A related gene to G584 is Phaseolus vulgaris phaseolin G-box binding protein PG1 (U18348). Similarity between G584 and PG1 extends beyond the signature motif of the family. No functional information is available for gene PG1 other than that the protein binds to a G-box motif CACGTG of the bean seed storage protein beta-phaseolin gene.

Experimental Observations

The function of G584 was analyzed using transgenic plants in which G584 was expressed under the control of the 35S promoter. G584 transgenic plants seemed to produce seed of a larger size than control plants. Analysis of G584 overexpressors revealed no apparent physiological or biochemical changes when compared to wild-type control plants. Analysis of the endogenous expression level of G584, as determined by RT-PCR, revealed a moderate and constitutive expression level in all Arabidopsis tissues examined. G584 transcript level remained similar to wild-type controls in all the treatments examined.

Utilities

G584 or its equivalogs could be used to produce larger seed size and/or altered seed morphology, which may positively influence seed storage characteristics, appearance and yield.

G590 (SEQ ID NO: 387)

Published Information

The sequence of G590 was obtained from the Arabidopsis genome sequencing project, GenBank accession number Z99707, based on its sequence similarity within the conserved domain to other bHLH/Myc related proteins. A knockout mutant in G590, named SPATULA, has also been isolated and characterized (Heisler et al. (2000) Development 128:1089-1098).

Experimental Observations

The function of this gene was studied by knockout analysis and by using transgenic plants in which G590 was expressed under the control of the 35S promoter.

G590 knockout plants produced more seed oil than wild-type controls.

Overexpression of G590 resulted in a reduction in flowering time and a shorter generation time. Under continuous light conditions, G590 overexpressing plants typically produced visible flower buds approximately one week earlier than wild-type controls. At the time of bolting, these plants had 4-8 rosette leaves compared with 8-11 in wild type. Additionally, G590 overexpressors had rather pointed leaves at early stages of development. The plants also appeared slightly small and yellow and, later, had elongated leaf petioles. No other physiological and biochemical alterations were observed in the overexpression transgenic plants when compared to wild-type controls.

Gene expression profiling using RT-PCR showed that G590 was expressed at relatively higher levels in flowers, siliques and roots. Its expression level was unaffected by any of the conditions tested.

Utilities

G590 or its equivalogs could be used to increase seed oil content, which would be of nutritional value for food for human consumption as well as animal feeds.

Based on the current analysis of G590 overexpressing plants, G590 or its equivalogs could be used to manipulate flowering time. A wide variety of applications exist for systems that shorten the time to flowering.

G592 (SEQ ID NO: 391)

Published Information

The genomic sequence of G592 has been determined as part of the Arabidopsis Genome Initiative (BAC clone T24P15, gene T24P15.19, GenBank accession number AC002561). There is no other published information available regarding the function of G592.

Experimental Observations

The function of G592 was analyzed using transgenic plants in which G592 was expressed under the control of the 35S promoter. Plants overexpressing G592 were marginally early flowering. Many of the T1 plants were smaller than wild-type controls. Analysis of G592 overexpressors revealed no apparent physiological or biochemical changes when compared to wild-type control plants. Analysis of the endogenous expression levels of G592, as determined by RT-PCR, revealed a moderate and constitutive expression level in all Arabidopsis tissues examined, although expression in roots was slightly higher. G592 expression appeared to be repressed by ABA, cold and salt treatments.

Utilities

Based on the current analysis of G592 overexpressing plants, G592 could be used to manipulate flowering time.

G598 (SEQ ID NO: 393)

Published Information

G598 was identified in chromosome II BAC T6D20 sequence (gene T6D20.23) by The Institute for Genomic Research as an “unknown protein”.

Experimental Observations

cDNAs representing two splice variants of G598 were identified. These splice variants differ in the 3′ end region and would produce proteins with different C-termini. The function of G598 was analyzed using transgenic plants in which splice variant number 1 of G598 was expressed under the control of the 35S promoter. G598 overexpressors had higher seed oil content in all three lines tested when measured by NIR. These three lines also showed increased galactose levels when insoluble sugar composition was determined. Otherwise, G598 overexpressors behaved similarly to wild-type controls in all biochemical assays performed. The characterization of G598 overexpressors revealed no apparent morphological or physiological changes when compared to wild-type control plants. Analysis of the endogenous expression level of G598, as determined by RT-PCR, revealed a moderate and constitutive expression level in all tissues and conditions examined.

One transgenic line showed a reproducible increase in galactose in leaves.

Utilities

On the basis of the biochemical analyses performed to date, G598 or its equivalogs may play a role in the accumulation or regulation of leaf insoluble sugars. Insoluble sugars are among the building blocks of plant cell walls. Transcription factors that alter plant cell wall composition such as galactose have several potential applications including altering food digestibility, plant tensile strength, wood quality, pathogen resistance and in pulp production. In particular, increasing the insoluble carbohydrate content in various fruits, vegetables, and other edible consumer products will result in enhanced fiber content. Increased fiber content would not only provide health benefits in food products, but might also increase digestibility of forage crops.

G598 or its equivalogs could be used to increase seed oil content, which would be of nutritional value for food for human consumption as well as animal feeds.

G624 (SEQ ID NO: 403)

Published Information

G624 was identified in the sequence of BAC F18E5, GenBank accession number AL022603, released by the Arabidopsis Genome Initiative. No further public or published information is available about the function of G624.

Experimental Observations

Overexpression of G624 produced a moderate delay in the onset of flowering (approximately one week under continuous light conditions). A number of the late flowering 35S::G624 transformants also displayed a marked increase in vegetative biomass compared to controls. No altered phenotypes were detected in any of the physiological assays.

Intriguingly, overexpression lines containing a truncated form of the cDNA (see sequence comments) exhibited wild-type morphology but displayed enhanced tolerance to both high sodium chloride and low phosphate growth conditions. It is possible that this effect represents a dominant negative phenotype.

Utilities

The delayed flowering displayed by 35S::G624 transformants suggests that the gene might be used to manipulate the flowering time of commercial species. In particular, an extension of vegetative growth or an increase in leaf size can significantly increase biomass and result in substantial yield increases.

Based on the increased salt tolerance exhibited by the 35S::G624 lines in physiology assays, this gene might be used to engineer salt tolerant crops and trees that can flourish in salinified soils, or under drought conditions.

The response of 35S::G624 seedlings to low phosphate conditions suggests that the gene could be used to manipulate nutrient uptake, or the ability to grow in poor nutrient soils.

G627 (SEQ ID NO: 405)

Published Information

G627 corresponds to AGAMOUS-LIKE 19 (AGL19), which was isolated by Alvarez-Buylla et al. ((2000) Plant J. 24: 457-466). No genetic characterization of AGL19 has been reported, but it was found to be specifically expressed in the outer layers of the root meristem (lateral root cap and epidermis) and in the central cylinder cells of mature roots (Alvarez-Buylla et al. (2000) supra).

Experimental Observations

RT-PCR expression studies failed to detect G627 in any of the tissue types analyzed. This result partially agreed with the data of Alvarez-Buylla et al. (2000) supra, who found that the gene is expressed only in specific regions of the root. It is possible that such regions were not sufficiently represented for G627 transcripts to be detected in the whole root samples analyzed in our expression studies.

In later experiments, however, a G627 clone was isolated by high cycle PCR from a cDNA sample derived from mixed tissues, and transgenic lines were generated in which this clone was expressed from a 35S promoter. A substantial proportion of the 35S::G627 lines flowered markedly earlier than control plants. Such effects were observed in both the T1 and T2 generations and indicated that the gene plays a role in the regulation of flowering time.

Utilities

Given the early flowering seen amongst the 35S::G627 transformants, the gene might be used to manipulate the flowering time of commercial species. In particular, G627 could be used to accelerate flowering or eliminate any requirement for vernalization. In some instances, a faster cycling time might allow additional harvests of a crop to be made within a given growing season. Shortening generation times could also help speed up breeding programs, particularly in species such as trees, which typically grow for many years before flowering. Conversely, it might be possible to modify the activity of G627 or its orthologs to delay flowering in order to achieve an increase in biomass and yield.

G634 (SEQ ID NO: 415)

Published Information

G634 was initially identified as public partial cDNA sequences for GTL1 and GTL2, which are splice variants of the same gene (Smalle et al. (1998) Proc. Natl. Acad. Sci. USA 95:3318-3322). The published expression pattern of GTL1 shows that G634 is highly expressed in siliques and not expressed in leaves, stems, flowers or roots.

Closely Related Genes from Other Species

A close non-Arabidopsis relative of G634 is the O. sativa gt-2 gene (2), which is proposed to bind and regulate the phyA promoter. In addition, the pea DNA-binding protein DF1 (13786451) shows strong homology to G634. The homology of these proteins to G634 extends to outside of the conserved domains and thus these genes are likely to be orthologs of G634.

Experimental Observations

The boundaries of G634 were experimentally determined and the function of G634 was investigated by constitutively expressing G634 using the CaMV 35S promoter.

Three constructs were made for G634: P324, P1374 and P1717. P324 was found to encode a truncated protein. P1374 and P1717 represent full-length splice variants of G634; P1374, the shorter of the two splice variants, was used for the experiments described here and the coding sequence of the P1374 clone is provided as the cDNA sequence for G634 in the Sequence Listing. The longest available cDNA (P1717), confirmed by RACE, had the same ATG and stop codons as the genomic sequence. Only data for P1374 are presented here.

Plants overexpressing G634 from construct P1374 had a dramatic increase in the density of trichomes, which were also larger in size. The increase in trichome density was most noticeable on later arising rosette leaves, cauline leaves, inflorescence stems and sepals, with the stem trichomes being more highly branched than controls. Approximately half of the primary transformants and two of three T2 lines showed the phenotype. Apart from slight smallness, there did not appear to be any other clear phenotype associated with the overexpression of G634. However, a reduction in germination was observed in T2 seeds grown in culture.

RT-PCR data showed that G634 was preferentially expressed in flowers and germinating seedlings, and induced by auxin.

Utilities

G634 or its equivalogs may be used to alter trichome structure, function or density. Trichome glands on the surface of many higher plants produce and secrete exudates that give protection from the elements and pests such as insects, microbes and herbivores. These exudates may physically immobilize insects and spores, may be insecticidal or antimicrobial, or they may be allergens or irritants to protect against herbivores. Trichomes have also been suggested to decrease transpiration by decreasing leaf surface air flow, and by exuding chemicals that protect the leaf from the sun.

Depending on the plant species, varying amounts of diverse secondary biochemicals (often lipophilic terpenes) are produced and exuded or volatilized by trichomes. These exotic secondary biochemicals, which are relatively easy to extract because they are on the surface of the leaf, have been widely used in such products as flavors and aromas, drugs, pesticides and cosmetics. One class of secondary metabolites, the diterpenes, can affect several biological systems such as tumor progression, prostaglandin synthesis and tissue inflammation. In addition, diterpenes can act as insect pheromones and termite allomones, and can exhibit neurotoxic, cytotoxic and antimitotic activities. As a result of this functional diversity, diterpenes have been the target of research by several pharmaceutical ventures. In most cases where the metabolic pathways are impossible to engineer, increasing trichome density or size on leaves may be the only way to increase plant productivity.

Thus, the use of G634 and its homologs to increase trichome density, size or type may have profound utilities in so-called molecular farming practices (i.e. the use of trichomes as a manufacturing system for complex secondary metabolites), and in producing insect- and herbivore-resistant plants.

G636 (SEQ ID NO: 417)

Published Information

G636 was identified through partial EST AA395524, released by Michigan State University. The entire sequence of G636 was later identified in BAC F7012, accession number F7012, released by the Arabidopsis Genome Initiative.

Closely Related Genes from Other Species

G636 is closely related to the Pisum sativum DNA-binding protein DF1, accession number AB052729, which may bind to light regulatory elements.

Experimental Observations

The 5′ boundary of G636 was determined and the function of G636 was analyzed by constitutively expressing the gene using the CaMV 35S promoter. Overexpression of G636 resulted in premature senescence of leaves and reduced plant size and fertility. No other phenotypic alterations were noted as a result of physiological or biochemical analyses.

G636 was constitutively expressed.

Utilities

G636 or its equivalogs may be used to alter senescence responses in plants. Although leaf senescence is thought to be an evolutionary adaptation to recycle nutrients, the ability to control senescence in an agricultural setting has significant value. For example, a delay in leaf senescence in some maize hybrids is associated with a significant increase in yields and a delay of a few days in the senescence of soybean plants can have a large impact on yield. Delayed flower senescence may also generate plants that retain their blossoms longer and this may be of potential interest to the ornamental horticulture industry.

G638 (SEQ ID NO: 419)

Published Information

G638 was identified in the sequence of BAC F17C15, GenBank accession number AL162506, released by the Arabidopsis Genome Initiative. During the course of its functional analysis, G638 was identified as the PETAL LOSS gene (Griffith et al. (1999) Development 126:5635-5644). The PETAL LOSS knockout mutant displays a variety of flower phenotypes, most strikingly characterized by a reduction in the number of petals. In addition to flower organ number, organ identity, shape and orientation, particularly of petals, are altered.

Closely Related Genes from Other Species

A relative of G638 is a Medicago truncatula gene represented by the EST BF646615, which was isolated from an elicited cell culture cDNA library.

Experimental Observations

The boundaries of G638 were experimentally determined and the function of G638 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. Overexpression of G638 caused severe alterations of plant development. The most striking feature of these overexpressor plants was that they had multipetallate flowers. In early flowers, some homeotic conversion had occurred between some organs of the flower. In all flowers made after these early flowers, petal number had been altered. Up to eight petals were consistently observed on plants that flowered, and as the plants grew older, the number of petals on new flowers was reduced from eight to about five. This phenotype was somewhat opposite to the phenotype observed with PETAL LOSS knockout plants and confirms a role for G638 in counting or maintaining petal number within the Arabidopsis flower. In addition to the flower phenotype, G638 overexpression caused alterations in phyllotaxy and leaf shape, and caused plants to be sterile. G638 appears to be constitutively expressed.

Utilities

G638 or its equivalogs could be used to manipulate plant architecture and leaf shape; in particular, this gene could be used to increase or decrease petal number in flowers. Overexpression of G638 also causes sterility, indicating there may be some use for this gene in engineering sterility into commercially relevant species.

G663 (SEQ ID NO: 435)

Published Information

G663 was identified from the Arabidopsis EST sequence, H76020, based on its sequence similarity within the conserved domain to other Myb family members in Arabidopsis. This gene was named MYB90 (Kranz et al. (1998) Plant J. 16:263-276). Reverse Northern data suggested G663 is expressed highly in leaves, siliques, and flowers and is induced by ethylene treatment.

Experimental Observations

The function of G663 was analyzed by its ectopic overexpression in plants. G663 overexpressors had constitutive anthocyanin production in seeds and roots. One line had higher anthocyanin production in leaf tissue as well. In other overexpressing lines, constitutive anthocyanin production was noted in trichomes and leaf margins. The overproduction of pigment in select tissues suggests there may be another transcription factor with which G663 interacts to activate the pathway. Using the corn system as a model, the interacting protein may be a bZIP-like transcription factor.

RT-PCR analysis of the endogenous levels of G663 indicated that this gene was expressed primarily in siliques and seedlings. Array data confirmed the high levels in silique and also detected high levels of G663 in germinating seed tissue. G663 transcripts were also induced above basal levels by all stress treatments tested except by infection with Erysiphe orontii. These data were consistent with G663 being involved in the anthocyanin biosynthetic pathway, which is part of a common multi-stress response pathway.

Utilities

The potential utilities of this gene or its equivalogs include alterations in pigment production for horticultural purposes, and possibly increasing stress resistance in combination with another transcription factor. Flavonoids have antimicrobial activity and could be used to engineer pathogen resistance. Several flavonoid compounds have health promoting effects such as the inhibition of tumor growth and cancer, prevention of bone loss and the prevention of the oxidation of lipids. Increasing levels of condensed tannins, whose biosynthetic pathway is shared with anthocyanin biosynthesis, in forage legumes is an important agronomic trait because they prevent pasture bloat by collapsing protein foams within the rumen. For a review on the utilities of flavonoids and their derivatives, refer to Dixon et al. ((1999) Trends Plant Sci. 10: 394-400).

G664 (SEQ ID NO: 437)

Published Information

G664 was identified from the Arabidopsis EST sequence, N38154, based on its sequence similarity within the conserved domain to other Myb family members in Arabidopsis. The Myb consortium named this gene MYB4 (Kranz et al. (1998) Plant J. 16: 263-276). Reverse Northern data suggested G664 is expressed highly in silique tissue with a low level of expression detected in all other tissues.

Closely Related Genes from Other Species

G664 shows extensive homology to the tomato gene THM27 (X95296) and the barley gene (X70877).

Experimental Observations

The function of G664 was analyzed through its ectopic overexpression in plants. G664 overexpressors germinated better and then developed more rapidly in cold conditions (8° C.) than wild-type controls. No differences in germination rates were observed on control MS media or in response to any other stress. Array data indicated that G664 was normally expressed primarily in root, shoot and silique.

Utilities

The potential utility of this gene or its equivalogs is to confer improved cold germination and/or growth. The germination of many crops like cotton is very sensitive to cold temperatures; a gene that would allow germination and seedling vigor in the cold would have tremendous utility in allowing seeds to be planted earlier in the season with a high rate of survivability.

G676 (SEQ ID NO: 457)

Published Information

G676 was identified from an Arabidopsis EST, N96391, based on its sequence similarity to other members of the Myb family within the conserved domain. The Myb consortium named this gene MYB66 (Kranz et al. (1998) Plant J. 16:263-276), and a report by Lee et al. ((1999) Cell 99: 473-483) describes a detailed functional analysis of G676, or “werewolf”. Werewolf (WER) is involved in position-dependent patterning of epidermal cell types. Transcripts were localized to root epidermal cells that will develop into non-hair cells. WER was shown to regulate the position-dependent expression of GLABRA2, to interact with the maize R gene, and to act as an antagonist to the myb protein CAPRICE (G225). These authors do not report altered trichome positioning in their 35S:wer overexpressors.

Experimental Observations

The function of G676 was analyzed through its ectopic overexpression in plants. Morphologically, the plants were small, and partially glabrous on the upper surface of the leaf. Ectopic trichomes developed on the underside of the leaf in one line. Lee et al. ((1999) Cell 99: 473-483) did not report altered trichome phenotypes in the leaves of the 35S:wer overexpression lines. The present lines showed a higher degree of overexpression, which could explain the small stature of the plants as well.

RT-PCR analysis of the endogenous levels of G676 indicated that this gene was expressed primarily in roots with a low level of expression in siliques and seedlings. G676 transcripts were not induced significantly above basal levels by any stress-related treatments tested. In disease-related treatments where whole seedlings were harvested, transcripts were detectable but not above basal levels. This may be related to the gene's root expression. G676 transcripts were not found in Fusarium oxysporum treated seedlings; it is possible this treatment represses G676 expression in the roots.

Utilities

The potential utility of G676 or its equivalogs is the production of ectopic trichomes on the surface of the leaf. It would be of significant agronomic value to have plants that exhibit greater numbers of glandular trichomes producing essential oils for the pharmaceutical and food industries, as well as oils that protect plants against insect and pathogen attack.

G680 (SEQ ID NO: 463)

Published Information

G680 or LHY (late elongated hypocotyl) is an unusual Myb transcription factor in that it contains a single Myb repeat instead of the two repeat sequences found in the majority of plant Myb genes (R2R3 Mybs). There are over 30 members of this single repeat Myb-related subfamily in the Arabidopsis genome. Both signature repeats in the R2R3 Myb domain are required for sequence-specific DNA binding. However, members of the Myb-related subfamily with a single repeat domain are also able to bind to DNA in a sequence-specific manner (Baranowskij et al. (1994) EMBO J. 13: 5383-5392; Feldbrugge et al. (1997) Plant J. 11: 1079-1093) and are therefore thought to function as transcription factors.

G680 or LHY overexpression affects many processes associated with the circadian clock, including the rhythmicity of both leaf movement and the expression of CAB and CCR2 genes, as well as photoperiodic control of flowering time (Schaffer et al. (1998) Cell 93: 1219-1229). Other reported pleiotropic effects include elongated hypocotyls, elongated petioles, and pale leaves (Schaffer et al. (1998) Cell 93: 1219-1229). All of these phenotypes could potentially be explained by the impairment of circadian clock function. LHY shows a high degree of homology to CCA1, another protein implicated in circadian clock function (Wang et al. (1997) Plant Cell 9: 491-507).

Experimental Observations

The function of G680 was analyzed through its ectopic overexpression in plants. G680 overexpressors were late flowering under both short and long day conditions; however, the late flowering phenotype appeared more consistently under short day conditions. The overexpressors were darker green in color compared to the wild-type controls at later stages of development. This was inconsistent with the published phenotype, which indicates the plants have less chlorophyll and are pale in color (Schaffer et al. (1998) Cell 93: 1219-1229). Preliminary data indicated that a vernalization treatment applied to germinating seedlings partially overcame the delay in flowering in the G680 overexpressors. Vernalized plants showed an approximate 35% reduction in leaf number on average compared to non-vernalized controls. Overexpression of G680 in plants also resulted in sensitivity to media containing high glucose in a germination assay, indicating a potential role for G680 in sugar sensing.

As determined by RT-PCR, G680 was uniformly expressed in all tissues tested. RT-PCR data also indicated a moderate induction of G680 transcript accumulation upon drought treatment, and that Erysiphe treatment could repress the expression of this gene.

Utilities

G680 or its equivalogs may be used to alter sugar sensing in plants. Sugars are key regulatory molecules that affect diverse processes in higher plants including germination, growth, flowering, senescence, sugar metabolism and photosynthesis. Sucrose is the major transport form of photosynthate and its flux through cells has been shown to affect gene expression and alter storage compound accumulation in seeds (source-sink relationships). Glucose-specific hexose-sensing has been described in plants and implicated in cell division and repression of ‘famine’ genes (photosynthetic or glyoxylate cycles). The potential utilities of a gene involved in glucose-specific sugar sensing are to alter energy balance, photosynthetic rate, carbohydrate accumulation, biomass production, source-sink relationships, and senescence.

Potential utilities of G680 or its equivalogs also include the regulation of flowering time. One area in which late flowering might be useful is crops where the vegetative portion of the plant is the marketable portion. In this case, it would be advantageous to prevent or delay flowering in order to increase yield. Prevention of flowering would also be useful in these same crops in order to prevent the spread of transgenic pollen and/or to prevent seed set.

A vernalization treatment applied to germinating G680 seedlings partially overcame the delay in flowering in the G680 overexpressors. Vernalized plants showed an approximate 35% reduction in leaf number on average compared to non-vernalized controls. Various late flowering mutants are partially rescued by GA applications (Chandler et al. (1994) J. Exp. Bot. 45: 1279-1288). Thus it is possible that G680 could be used to increase the vegetative phases of development in order to increase yield, with flowering then triggered via a cold treatment or a gibberellic acid application.

G682 (SEQ ID NO: 467)

Published Information

G682 was identified from the Arabidopsis BAC, AF007269, based on sequence similarity to other members of the Myb family within the conserved domain.

Experimental Observations

The function of G682 was analyzed through its ectopic overexpression in plants. G682 overexpressors were glabrous, had tufts of more root hairs and germinated better under heat stress conditions. Older plants were not more tolerant to heat stress compared to wild-type controls.

RT-PCR analysis of the endogenous levels of G682 transcripts indicated that this gene was expressed in all tissues tested; however, a very low level of transcript was detected in roots and shoots. Array tissue print data indicated that G682 was expressed primarily, but not exclusively, in flower tissue.

An array experiment was performed on one G682 overexpressing line. The data from this one experiment indicated that this gene could be a negative regulator of chloroplast development and/or light dependent development because the gene Albino3 and many chloroplast genes were repressed. Albino3 functions to regulate chloroplast development (Sundberg et al. (1997) Plant Cell 9: 717-730). The gene G682 was itself induced 20-fold. Other than a few additional transcription factors, very few genes were induced as a result of the ectopic expression of G682.

A number of plants transformed with G682 lacked trichomes.

Plants overexpressing paralogs of G682, including G225, G226, G1816, and G2718, transcription factors within the same clade, have traits similar to those of plants that overexpress G682. When any of these five transcription factors are overexpressed in plants, the plants show phenotypes related to glabrousness and abiotic stress tolerance, including:

(a) a reduced number or lack of trichomes by overexpressing G682, G225, G226, G1816, and G2718;
(b) increased root hairs, the latter indicating improved resistance to osmotic stress by overexpressing G682, G225, G226, G1816, and G2718;
(c) increased tolerance to nitrogen-limiting conditions by overexpressing G682, G225, G226, G1816 and G2718;
(d) increased heat tolerance by overexpressing G682 and G225;
(e) increased tolerance to salt, by overexpressing G226;
(f) altered sugar sensing, by plants overexpressing G1816;
(g) altered sugar sensing by plants overexpressing G1816 and G2718; and
(h) improved root growth, by plants overexpressing G2718.
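The trait assignments in items (a) through (h) above can be restated as a small lookup table. The following is only an illustrative sketch: the dictionary layout, the shortened trait labels, and the helper names are assumptions made for this example, and items (f) and (g) are merged into a single "altered sugar sensing" entry.

```python
# Trait-to-gene assignments transcribed from items (a)-(h) above.
# Labels and helper names are hypothetical, chosen only for illustration.
CLADE = {"G682", "G225", "G226", "G1816", "G2718"}

TRAITS = {
    "reduced number or lack of trichomes": set(CLADE),      # item (a)
    "increased root hairs": set(CLADE),                     # item (b)
    "tolerance to nitrogen-limiting conditions": set(CLADE),  # item (c)
    "increased heat tolerance": {"G682", "G225"},           # item (d)
    "increased salt tolerance": {"G226"},                   # item (e)
    "altered sugar sensing": {"G1816", "G2718"},            # items (f), (g)
    "improved root growth": {"G2718"},                      # item (h)
}


def traits_shared_by_all(traits, clade):
    """Return the traits reported for every member of the clade."""
    return sorted(name for name, genes in traits.items() if genes == clade)


def traits_for_gene(traits, gene):
    """Return every trait reported for one gene identifier."""
    return sorted(name for name, genes in traits.items() if gene in genes)


if __name__ == "__main__":
    print(traits_shared_by_all(TRAITS, CLADE))
    print(traits_for_gene(TRAITS, "G2718"))
```

Arranged this way, the three traits common to all five clade members (items (a) to (c)) separate cleanly from the traits reported for only a subset of the paralogs.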

MYB transcription factors can be subdivided into two classes based on polynucleotide sequence characteristics. The sequences of representative Class I transcription factors (CITFs) and Class II transcription factors (CIITFs), the latter group being the one in which G682 is found, are shown in Table 11. The coordinates for the conserved domains of the transcription factors used to prepare Table 12b may be found in Table 5, and appear as the underlined regions in Table 11.

TABLE 11

Class I transcription factors

G247  MRMTRDGKEHEYKKGLWTVEEDKILMDYVRTHGQGHWNRIAKKTGLKRCGKSCRLRWMNYLSPNVNRGNFTDQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKLGLGDHSTAVKAACGVESPPSMALITTTSSSHQEISGGKNSTLRFDTLVDESKLKPKSKLVHATPTDVEVAATVPNLFDTFWVLEDDFELSSLTMMDFTNGYCL

G212  MRIRRRDEKENQEYKKGLWTVEEDNILMDYVLNHGTGQWNRIVRKTGLKRCGKSCRLRWMNYLSPNVNKGNFTEQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKLVGDYSSAVKTTGEDDDSPPSLFITAATPSSCHHQQENIYENIAKSFNGVVSASYEDKPKQELAQKDVLMATTNDPSHYYGNNALWVHDDDFELSSLVMMNFASGDVEYCL

G676  MRKKVSSSGDEGNNEYKKGLWTVEDDKILMDYVKAHGKGHWNRIAKKTGLKRCGKSCRLRWMNYLSPNVKRGNFTEQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKLGIKDQKTKQSNGDIVYQINLPNPTETSEETKISNIVDNNNILGDEIQEDHQGSNYLSSLWVHEDEFELSTLTNMMDFIDGHCF

G1332  MECKREEGKSYVKRGLWKPEEDMILKSYVETHGEGNWADISRRSGLKRGGKSCRLRWKNYLRPNIKRGSMSPQEQDLIIRMHKLLGNRWSLIAGRLPGRTDNEVKNYWNTHLNKKPNSRKQNAPESIVGATPFTDKPVMSTELRRSHGEGGEEESNTWMEETNHFGYDVHVGSPLPISHYPDNTLVFDPCFSFTDFFPLL

Class II transcription factors

G225  MFRSDKAEKMDKRRRRQSKAKASCSEEVSSIEWEAVKMSEEEEDLISRMYKLVGDRWELIAGRIPGRTPEEIERYWLMKHGVVFANRRRDFFRK

G226  MDNTNRLRLRRGPSLRQTKFTRSRYDSEEVSSIEWEFISMTEQEEDLISRMYRLVGNRWDLIAGRVVGRKANEIERYWIMRNSDYFSHKRRRLNNSPFFSTSPLNLQENLKL

G682  MDNHRRTKQPKTNSIVTSSSEEVSSLEWEVVNMSQEEEDLVSRMHKLVGDRWELIAGRIPGRTAGEIERFWVMKN

G1816  MDNTDRRRRRKQHKIALHDSEVSSIEWEFINMTEQEEDLIFRMYRLVGDRWDLIAGRVPGRQPEEIERYWIMRNSEGFADKRRQLHSSSHKHTKPHRPRFSIYPS

G2718  MNTQRKSKHLKTNPTIVASSSEEVSSLEWEEIAMAQEEEDLICRMYKLVGERWDLIAGRIPGRTAEEIERFWVMKNHRRSQLR

Table 12a shows the percent identity in the MYB conserved domains for the sequences included in the Sequence Listing, as originally calculated in U.S. patent application Ser. No. 09/489,376. Table 12b shows the percent identity in the conserved domains for the sequences included in the Sequence Listing, as calculated using revised amino acid residue coordinates for the conserved domains (Table 5, and below) of the respective polypeptides, and also includes percent identity comparisons with G1816 and G2718.

TABLE 12a

GID No.  SEQ ID NO:  G212  G247  G676  G1332  G682  G226  G225
G212     124         100%  42%   41%   31%    36%   40%   36%
G247     170         42%   100%  92%   74%    16%   17%   15%
G676     458         41%   92%   100%  68%    15%   17%   15%
G1332    856         31%   74%   68%   100%   18%   18%   18%
G682     468         36%   16%   15%   18%    100%  56%   81%
G226     142         40%   17%   17%   18%    56%   100%  65%
G225     140         36%   15%   15%   18%    81%   65%   100%

TABLE 12b

GID No.  SEQ ID NO:  G212  G247  G676  G1332  G682  G226  G225  G1816  G2718
G212     124         100%  91%   91%   69%    52%   60%   57%   60%    54%
G247     170         91%   100%  95%   70%    52%   57%   55%   57%    55%
G676     458         91%   95%   100%  70%    52%   60%   57%   60%    54%
G1332    856         69%   70%   70%   100%   60%   62%   64%   57%    57%
G682     468         52%   52%   52%   60%    100%  63%   82%   65%    78%
G226     142         60%   57%   60%   62%    63%   100%  70%   82%    70%
G225     140         57%   55%   57%   64%    82%   70%   100%  78%    78%
G1816    940         60%   57%   60%   57%    65%   82%   78%   100%   73%
G2718    960         54%   55%   54%   57%    78%   70%   78%   73%    100%

Based on the analyses using the conserved domains shown in Table 11, the CITFs G212, G247, G676, and G1332 show a high degree of sequence identity in the MYB domain with each other (Table 12b; in the range of 69% to 95%), and CIITFs G682, G225, G226, G1816 and G2718 share a high degree of sequence identity with each other (Table 12b; in the range of 63% to 82%). Comparisons between CITFs and CIITFs reveal a generally lower degree of identity between transcription factors in the different groups (Table 12b; 52% to 64%).
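The within-class and between-class ranges quoted above can be recomputed directly from the values of Table 12b. The following is a minimal sketch that hard-codes the table entries; the variable and function names are assumptions introduced for illustration, and the code does not reproduce the alignment procedure used to generate the table.

```python
from itertools import combinations, product

GIDS = ["G212", "G247", "G676", "G1332", "G682", "G226", "G225", "G1816", "G2718"]

# Percent identity values transcribed from Table 12b, rows and columns in GIDS order.
MATRIX = [
    [100, 91, 91, 69, 52, 60, 57, 60, 54],
    [91, 100, 95, 70, 52, 57, 55, 57, 55],
    [91, 95, 100, 70, 52, 60, 57, 60, 54],
    [69, 70, 70, 100, 60, 62, 64, 57, 57],
    [52, 52, 52, 60, 100, 63, 82, 65, 78],
    [60, 57, 60, 62, 63, 100, 70, 82, 70],
    [57, 55, 57, 64, 82, 70, 100, 78, 78],
    [60, 57, 60, 57, 65, 82, 78, 100, 73],
    [54, 55, 54, 57, 78, 70, 78, 73, 100],
]

CLASS_I = ["G212", "G247", "G676", "G1332"]            # CITFs
CLASS_II = ["G682", "G226", "G225", "G1816", "G2718"]  # CIITFs


def identity(a, b):
    """Look up the percent identity between two GIDs."""
    return MATRIX[GIDS.index(a)][GIDS.index(b)]


def identity_range(pairs):
    """Return (minimum, maximum) percent identity over the given GID pairs."""
    values = [identity(a, b) for a, b in pairs]
    return min(values), max(values)


within_class_i = identity_range(combinations(CLASS_I, 2))
within_class_ii = identity_range(combinations(CLASS_II, 2))
between_classes = identity_range(product(CLASS_I, CLASS_II))

if __name__ == "__main__":
    print("Class I:", within_class_i)
    print("Class II:", within_class_ii)
    print("Between classes:", between_classes)
```

Only distinct pairs are compared, so the 100% self-comparisons on the diagonal do not enter the ranges.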

Utilities

The potential utility of this gene or its equivalogs is to confer heat tolerance to germinating seeds.

G682 or its equivalogs could be used to alter trichome number and distribution in plants. Trichome glands on the surface of many higher plants produce and secrete exudates, which give protection from the elements and pests such as insects, microbes and herbivores. These exudates may physically immobilize insects and spores, may be insecticidal or antimicrobial, or they may be allergens or irritants to protect against herbivores. Trichomes have also been suggested to decrease transpiration by decreasing leaf surface air flow, and by exuding chemicals that protect the leaf from the sun.

G736 (SEQ ID NO: 487)

Published Information

G736 was discovered as a full length EST clone. It was subsequently localized to BAC AC002341.

Experimental Observations

RT-PCR analysis of the endogenous levels of G736 indicated that this gene was expressed at low to medium levels in all tissues tested. In addition, there was no induction of G736 above its basal level in response to environmental stress treatments.

Two out of three G736 overexpressing lines exhibited a severe late flowering phenotype in both the T1 and T2 generations; the third line was late flowering in the T1 generation but the phenotype was lost in the subsequent generation, most likely due to silencing of the transgene. All three lines exhibited elongated petioles in both generations, and in two of the T1 lines, failure of the siliques to elongate was also observed. This phenotype was lost in the subsequent generation.

Utilities

Overexpression of G736 and its equivalogs may be used to substantially delay flowering. A wide variety of applications exist for genes that either lengthen or shorten the time to flowering, or for systems of inducible flowering time control. In particular, in species where the vegetative parts of the plants constitute the crop and the reproductive tissues are discarded, it would be advantageous to delay or prevent flowering. Extending vegetative development could bring about large increases in yields. Additionally, a major concern is the escape of transgenic pollen from GMOs to wild species or so-called organic crops. Systems that prevent vegetative transgenic crops from flowering would eliminate this worry.

G748 (SEQ ID NO: 497)

Published Information

A cDNA sequence for G748 was deposited in GenBank by Abbaraju and Oliver on Aug. 4, 1998. G748 encodes a protein containing a Dof zinc-finger domain that was found to bind the H-protein promoter. The H protein is a component of the glycine decarboxylase multienzyme complex, which comprises over one-third of the soluble proteins in mitochondria isolated from the leaves of C3 plants (Oliver et al. (1995) Bioenerg. Biomembr. 27: 407-414). A published function for G748 is a putative regulatory role in H-protein gene expression, suggested by the promoter-binding data.

Closely Related Genes from Other Species

Close relatives to G748 include a rice gene (GB accession # BAA88190) and a pumpkin gene (GB accession # D45066). In both cases, the similarity extends beyond the conserved DNA-binding domain, which suggests the genes could be orthologs of G748. The pumpkin gene encodes an ascorbate oxidase promoter-binding protein, suggesting that the product of G748 could also bind that promoter.

Experimental Observations

A cDNA sequence was isolated and used to produce transgenic plants overexpressing G748. Overexpression of G748 resulted in a late flowering phenotype. Transgenic plants were generally large and dark green with more rosette leaves. Stems were thicker and more vascular bundles were noticeable in transverse sections. G748 overexpressors also produced more lutein in seeds (consistently observed in three lines). The high lutein phenotype was confirmed in a repeat experiment. The physiology of the plant was similar to that of the controls. In wild-type plants, G748 was constitutively expressed, although at lower levels at the seedling stage. Expression levels were lower upon infection with E. orontii and Fusarium.

Utilities

Experimental data showed that G748 or its equivalogs can be used to delay flowering in transgenic plants.

Arabidopsis plants overexpressing G748 produced more lutein in seeds.

Plants transformed with G748 had modified stem morphology and vascular bundles and may be used to affect overall plant architecture.

G779 (SEQ ID NO: 525)

Published Information

G779 has been previously identified; fruits from ind1 knockout mutant plants do not show cell differentiation in the dehiscence zone (Liljegren et al. (2000) Abstracts 11th Intl. Conf. Arabidopsis Res., Madison, Wis., pp. 179). These results suggest that G779 may mediate cell differentiation during Arabidopsis fruit development.

Closely Related Genes from Other Species

G779 is closely related to a Brassica rapa subsp. pekinensis cDNA isolated from flower bud (acc#AT002234).

Experimental Observations

The function of G779 was analyzed using transgenic plants in which G779 was expressed under the control of the 35S promoter. Morphological analysis of overexpressors indicated that primary transformants of G779 had high levels of anthocyanin in seedlings, produced small plants with disorganized rosettes and short internodes, and many had flower abnormalities. The transformants with flower abnormalities showed conversion of sepals to carpels. The most severely affected had full conversion of sepals to carpels with ovules, stigmatic tissue on petals and stamens, and in some cases showed organ fusions. In the severe case of one T1 line, some inflorescences showed no flowers at all. Plants with a weak phenotype showed only small patches of stigmatic tissue on sepals. The floral phenotypes decreased acropetally. The plants showing the strongest phenotypes were essentially sterile, and did not produce T2 progeny for further analysis.

The phenotypes produced by overexpressing G779 and G1499 were similar with respect to flower structure. Cluster analysis using the basic helix-loop-helix motif revealed that the G779 and G1499 proteins are closely related. The fact that expression of G779 was induced by auxin treatment in the rosette leaves indicates that G779 may play a role in the auxin signal transduction pathway.

Utilities

G779 or its equivalogs could be used to modify plant architecture and development, including flower structure. If expressed under a flower-specific promoter, it might also be useful for engineering male sterility. Because expression of G779 is flower, embryo and silique specific, its promoter could be useful for targeted gene expression in these organs.

G789 (SEQ ID NO: 539)

Published Information

A partial sequence of G789 was identified from an EST clone (GenBank accession number T41998).

Experimental Observations

G789 was initially identified as a public EST (GenBank accession number T41998) and subsequently a full length library clone was identified. The function of G789 was analyzed using transgenic plants in which G789 was expressed under the control of the 35S promoter.

Overexpression of G789 reduced the time to flowering under continuous light conditions; this phenotype was most prevalent in the T2 generation and was noted in all three of the lines analyzed.

Transgenic plants overexpressing G789 were more sensitive to the herbicides glyphosate and acifluorfen and to oxidative stress caused by rose bengal compared to wild-type controls. Furthermore, G789 overexpressing lines were more susceptible to infection with Sclerotinia sclerotiorum when tested as mixed lines in two repeat experiments. This disease susceptibility phenotype did not repeat when individual lines were tested. It is well known that oxidative stress is a component of the plant defense response to pathogens; the disease susceptibility phenotype could therefore be related to a general sensitivity to oxidative stress.

Based on the RT-PCR analysis, G789 was constitutively expressed in all tissues; its expression level was unaffected by any of the conditions tested.

Utilities

Based on the current analysis of G789 overexpressing plants, G789 or its equivalogs could be used to manipulate flowering time.

Since G789 activity has been shown to be required for the protection of Arabidopsis plants against oxidative stress, G789 or its equivalogs could be used to manipulate defenses against abiotic and biotic stresses such as disease, UV-B radiation, ozone pollution and herbicide application.

G801 (SEQ ID NO: 549)

Published Information

A partial sequence for G801 was identified from EST clones (GenBank accession numbers N97289, H36373 and Z32574).

Experimental Observations

G801 is a proprietary sequence initially identified as three partial public ESTs (GenBank accession numbers N97289, H36373 and Z32574). Subsequently, a full length library clone was identified. The function of G801 was analyzed using transgenic plants in which G801 was expressed under the control of the 35S promoter. Morphological analysis revealed that a minority of primary transformants of G801 were dark green and late flowering. However, T2 lines derived from three late flowering lines showed no flowering time differences from control plants. Plants overexpressing G801 showed more seedling vigor when germinated on media containing high salt compared to wild-type control plants. All three overexpressing lines showed similar degrees of tolerance. In addition, overexpression of G801 in Arabidopsis resulted in an increase in seed oil content. This phenotype was observed in a single line.

Utilities

The potential utilities of this gene or its equivalogs include the ability to confer salt tolerance during the germination stage of a crop plant. This would most likely impact survivability and yield. Evaporation of water from the soil surface causes upward water movement and salt accumulation in the upper soil layer, where the seeds are placed. Thus, germination normally takes place at a salt concentration much higher than the mean salt concentration in the whole soil profile.

In addition, G801 or its equivalogs may be used to increase seed oil in crop plants.

G849 (SEQ ID NO: 565)

Published Information

The transcription factor G849 is an Arabidopsis homolog of parsley BPF-1, a pathogen inducible DNA-binding protein. BPF-1, Box-P Binding Factor 1, was reported by da Costa e Silva et al. ((1993) Plant Journal 4: 125-135) to bind specifically to the P-box sequence motif of the promoter of phenylalanine ammonia lyase, a key enzyme of phenylpropanoid metabolism. G849 is found in the sequence of chromosome 3, BAC T2E22 (GenBank AC069474.4 GI: 12321944), released by the Arabidopsis Genome Initiative. The start and stop codons were correctly predicted.

Experimental Observations

NIR analyses performed on G849 knockout plants revealed increased total combined seed oil and protein content.

RT-PCR analysis of the endogenous level of G849 transcripts revealed high constitutive expression in all tissues examined, with the exception of germinated seed. A detectable but low level of G849 transcripts was observed in germinated seeds. G849 transcript level increased significantly upon auxin, ABA, cold, heat and salt treatment, as well as seven days post-inoculation with Erysiphe orontii.

Utilities

Based on the knockout analyses, G849 or its equivalogs may be used to modify seed oil and protein content.

The null mutant of G849 had altered seed phytosterol composition, a decrease in beta-sitosterol, as well as changes in leaf insoluble sugars. Phytosterols are an important source of precursors for the manufacture of human steroid hormones by semisynthesis. Sitosterols and stigmasterols, not campesterol, are the preferred sources from seed crops. Phytosterols and their hydrogenated derivatives phytostanols also have proven cholesterol-lowering properties.

G859 (SEQ ID NO: 567)

Published Information

G859 corresponds to MXK3.30 (BAB10332). The high level of sequence similarity between G859 and FLOWERING LOCUS C (FLC; Michaels et al. (1999) Plant Cell 11, 949-956; Sheldon et al. (1999) Plant Cell 11, 445-458) has been described previously (Ratcliffe et al. (2001) Plant Physiol. 126: 122-132). G859 has also been referred to as AGL31 (Alvarez-Buylla et al. (2000) Plant J. 24: 457-466).

Experimental Observations

G859 was recognized as a gene highly related to Arabidopsis FLC, and to MADS AFFECTING FLOWERING 1. FLC acts as a repressor of flowering (Michaels et al. (1999) Plant Cell 11, 949-956; Sheldon et al. (1999) Plant Cell 11, 445-458). Similarly, G157/MAF1 can cause a delay in flowering time when overexpressed (Ratcliffe et al. (2001) Plant Physiol. 126: 122-132).

The function of G859 was studied using transgenic plants in which this gene was expressed under the control of the 35S promoter. Overexpression of G859 modified the timing of flowering, with very high levels of G859 activity delaying the floral transition in the Columbia ecotype. No alterations were detected in 35S::G859 plants in the physiological and biochemical analyses that were performed.

Under continuous light conditions, the majority of 35S::G859 primary transformants (overexpressing a construct containing a full-length cDNA, P1688) were earlier flowering than wild-type controls. This result was observed in multiple independent batches of T1 plants and in either continuous or 12 hour light conditions. However, in each selection of primary transformants, a small number of lines were late flowering. RT-PCR analyses demonstrated that all T1 plants overexpressed the transgene, but that the highest levels of expression were found in the late flowering transformants. Comparable results were also obtained when plants were transformed with a construct (P376) containing a shorter splice-variant of G859. The effects on flowering time caused by overexpression of G859, and the dependence of those effects on the transgene expression levels, mirror results previously obtained for G157/MAF1 (Ratcliffe et al. (2001) Plant Physiol. 126: 122-132).

Seed was taken for T2 analyses from two late flowering primary transformants, and a T1 plant that had been early flowering. The progeny of the former two lines all appeared markedly late flowering, while the T2 plants from the third line were marginally late flowering. No convincing early flowering was observed in any of the three T2 populations. Thus, in the second generation, the predominant effect of G859 activity was delayed flowering. In a follow-up experiment it was found that late flowering 35S::G859 T2 plants were photoperiod responsive, and were not sensitive to extensive vernalization treatments.

Utilities

G859 or its equivalogs could be used to alter flowering time.

G864 (SEQ ID NO: 573)

Published Information

G864 was identified in an Arabidopsis EST (H37693). G864 appears as gene AT4g23750 in the annotated sequence of Arabidopsis chromosome 4 (AL161560).

Experimental Observations

G864 was discovered and initially identified as a public Arabidopsis EST.

The complete sequence of G864 was determined, and G864 was found to be related to two additional Arabidopsis AP2/EREBP genes, G1421 and G1755. The function of G864 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G864 overexpressing plants exhibited a variety of phenotypic alterations. They were smaller than wild-type plants, and those with the strongest phenotypes were classified as dwarf. However, G864 overexpressing lines showed more seedling vigor in a heat stress tolerance germination assay compared to wild-type controls. Conversely, G864 overexpressing lines were also somewhat more sensitive to chilling. One of the three T2 lines analyzed showed a significant increase in fucose and arabinose levels in leaves.

G864 was ubiquitously expressed, and was not significantly induced under any of the conditions tested.

Utilities

The germination of many crops is very sensitive to temperature. A gene such as G864 or its equivalogs that would enhance germination in hot conditions would be useful for crops that are planted late in the season or in hot climates.

G867 (SEQ ID NO: 579)

Published Information

G867 corresponds to RAV1 (Kagaya et al. (1999) Nucleic Acids Res. 27: 470-478). G867/RAV1 belongs to a small subgroup within the AP2/EREBP family of transcription factors, whose distinguishing characteristic is that its members contain a second DNA-binding domain, in addition to the conserved AP2 domain, that is related to the B3 domain of VP1/ABI3 (Kagaya et al. (1999) supra). It has been shown that the two DNA-binding domains of RAV1 can separately recognize each of two motifs that constitute a bipartite binding sequence and together cooperatively enhance its DNA-binding affinity and specificity (Kagaya et al. (1999) supra).

Experimental Observations

G867 was discovered and initially identified as a public Arabidopsis EST. G867 appeared to be constitutively expressed at medium levels.

G867 was first characterized using a line that contained a T-DNA insertion in the gene. The insertion in that line resided immediately downstream of the conserved AP2 domain, and would therefore be expected to result in a severe or null mutation. G867 knockout mutant plants did not show significant changes in overall plant morphology, and significant differences between these plants and control plants have not been detected in any of the assays that have been performed so far.

Subsequently, the function of G867 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G867 overexpressing lines were morphologically wild-type and no phenotypic alterations in G867 overexpressing lines were detected in the biochemical assays that were performed. However, G867 overexpressing lines showed increased seedling vigor (manifested by increased expansion of the cotyledons) in germination assays on both high salt and high sucrose containing media, compared to wild-type controls.

The Arabidopsis paralogs G1930 (SEQ ID NO: 1893) and G9 (SEQ ID NO: 14) also showed stress related phenotypes. G9 exhibited increased root biomass, and thus could be used to produce better plant growth under adverse osmotic conditions. Genetic and physiological evidence indicates that roots subjected to various stresses, including water deficit, alter the export of specific compounds, such as ACC and ABA, to the shoot, via the xylem (Bradford et al. (1980) Plant Physiol. 65: 322-326; Schurr et al. (1992) Plant Cell Environ. 15, 561-567).

G1930 plants responded to high NaCl and high sucrose on plates with more seedling vigor and root biomass compared to wild-type control plants; this phenotype was identical to that seen in 35S::G867 lines. These results indicate a general involvement of this clade in abiotic stress responses. The polypeptide sequences of G1930 and G9 share 72% (249/345 residues) and 64% (233/364 residues) identity with G867, respectively. The conserved domains of G1930 and G9 are 86% (56/65 residues) and 86% (56/65 residues) identical with the conserved domain of G867, respectively.
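The percentages quoted above follow from the stated residue counts by simple division and rounding. The following is a minimal sketch of that arithmetic; the function name and dictionary keys are assumptions made for illustration, and only the residue counts given in the text are used.

```python
def percent_identity(identical_residues, compared_residues):
    """Percent identity as identical residues over residues compared."""
    return 100.0 * identical_residues / compared_residues


# (identical, compared) residue counts versus G867, as stated in the text.
COMPARISONS = {
    ("G1930", "full length"): (249, 345),
    ("G9", "full length"): (233, 364),
    ("G1930", "conserved domain"): (56, 65),
    ("G9", "conserved domain"): (56, 65),
}

# Round to the nearest whole percent, as the text reports whole numbers.
ROUNDED = {
    key: round(percent_identity(*counts)) for key, counts in COMPARISONS.items()
}

if __name__ == "__main__":
    for (gid, region), value in ROUNDED.items():
        print(f"{gid} vs G867 ({region}): {value}%")
```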

Utilities

G867 or its equivalogs could be used to increase or facilitate seed germination and seedling growth under adverse environmental conditions, in particular salt stress.

G867 or its equivalogs may also be used to modify sugar sensing.

G869 (SEQ ID NO: 581)

Published Information

A partial cDNA sequence of G869 is available as public EST N65486. The sequence of G869 later appeared among the Arabidopsis sequences released by the Arabidopsis Genome Initiative, in BAC T26J14 (GenBank accession number AC011915).

Experimental Observations

The complete cDNA sequence of G869 was determined. The function of this gene was analyzed using transgenic plants in which G869 was expressed under the control of the 35S promoter. Plants overexpressing G869 were small with spindly bolts. G869 transgenic plants showed alterations in leaf and seed fatty acid composition. In leaves, 16:0 levels decreased and 16:3 levels increased. These changes likely reflected alterations in the desaturation state of chloroplast membranes. In seeds, 18:1 levels increased significantly. The increase in the seed 18:1 fatty acid in two lines was observed in a repeat experiment. A decrease in 18:3 and 20:0 was also noted in these lines.

Alterations in the levels of leaf insoluble sugars were also detected, with the increase in fucose determined to be significant. In addition, G869 overexpressors were more tolerant to infection with a moderate dose of the fungal pathogen Erysiphe orontii. The increase in resistance phenotype cosegregated with the dwarf phenotype. G869 plants showed additional morphological alterations, including poor fertility due to underdeveloped anthers.

Utilities

G869 or its equivalogs could be useful to manipulate the saturation levels of lipids in seeds. Alteration in seed lipid saturation could be used to improve the heat stability of oils or to improve the nutritional quality of seed oil.

As G869 transgenic plants have an altered response to the fungal pathogen Erysiphe orontii, G869 or its equivalogs could be used to manipulate the defense response in order to generate pathogen-resistant plants.

G877 (SEQ ID NO: 583)

Published Information

G877 was identified in an Arabidopsis EST (N37131). G877 is contained in P1 clone MXK23 (GenBank accession number AB026656).

Closely Related Genes from Other Species

A non-Arabidopsis gene closely related to G877 is the tobacco gene NtWRKY4 (GenBank accession number AB026890). Similarity between these two genes extends beyond the conserved WRKY domain.

Experimental Observations

G877 was first discovered and identified as a public Arabidopsis EST. The complete sequence of G877 was determined.

A line was identified that contains a T-DNA insertion in the coding sequence of G877. The insertion likely resulted in a null mutation, since it resided upstream of the conserved WRKY domain sequence. Plants that were hemizygous for that insertion segregated 3 viable : 1 nonviable seeds in the silique, and homozygous G877 knockout mutant plants were never obtained. Therefore, a (null) mutation in G877 results in embryo lethality.
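The 3 viable : 1 nonviable segregation described above is the ratio expected when a plant carrying one copy of a recessive embryo-lethal insertion is selfed. The following is a minimal sketch of that expectation, assuming simple Mendelian inheritance with equal transmission of both alleles through male and female gametes; the allele symbols "G" (wild type) and "g" (T-DNA insertion) and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def selfed_progeny(parent=("G", "g")):
    """Enumerate the equally likely gamete pairings from selfing one plant.

    "G" stands for the wild-type allele and "g" for the insertion allele;
    both symbols are illustrative assumptions.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent, repeat=2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


progeny = selfed_progeny()
nonviable = progeny["gg"]   # homozygous for the insertion, embryo lethal
viable = 1 - nonviable      # "GG" and "Gg" seeds

if __name__ == "__main__":
    print(progeny)
    print("viable : nonviable =", viable / nonviable, ": 1")
```

Under these assumptions one quarter of the seeds are homozygous for the insertion, which matches both the observed nonviable fraction and the failure to recover homozygous knockout plants.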

G877 was ubiquitously expressed. G877 is likely to be involved in controlling some essential process(es) required for growth rather than specific aspects of embryo patterning and development. Alternatively, G877 might play different roles throughout the plant life cycle.

Utilities

The embryo lethal phenotype of a G877 mutation indicates that the gene is involved in the control of some essential aspect of growth and development. G877 or its equivalogs could therefore constitute an herbicide target, either by itself or by allowing the identification of other genes or processes essential for plant growth.

G881 (SEQ ID NO: 587)

Published Information

G881 corresponds to gene F28M20.10, first identified in the sequence of BAC clone F28M20 (released by the Arabidopsis Genome Initiative; GenBank accession number AL031004).

Experimental Observations

The complete cDNA sequence for G881 was determined. The annotation in GenBank for this gene (BAC AL031004) was found to be inaccurate. G881 was ubiquitously expressed, but appeared to be significantly induced in response to salicylic acid (SA) treatment. The function of this gene was analyzed using transgenic plants in which G881 was expressed under the control of the 35S promoter. G881 overexpressors appeared to be more susceptible to infection with a moderate dose of the fungal pathogen Erysiphe orontii. Increased susceptibility to Erysiphe orontii was confirmed in a repeat experiment. The induction of G881 expression by SA also implicated G881 in the disease response.

Utilities

Since G881 transgenic plants appear to have an altered response to the fungal pathogen Erysiphe orontii, G881 or its equivalogs could be used to manipulate the defense response in order to generate pathogen-resistant plants.

G896 (SEQ ID NO: 595)

Published Information

G896 was identified in the sequence of BAC T7123, GenBank accession number U89959, released by the Arabidopsis Genome Initiative. Part of the G896 sequence was first identified as an MSU EST (T45249). There is no other published or public information about G896.

Closely Related Genes from Other Species

G896 is very similar to a peppermint EST (AW255156). Since the homology extends beyond the conserved domain, G896 and the mint gene are likely orthologs.

Experimental Observations

A knock-out mutant was isolated, which contains a T-DNA insertion 40 base pairs downstream of the start codon. G896 knock-out plants were more susceptible to Fusarium oxysporum. In addition, G896 knockout plants had lower levels of lutein in seeds as compared to wild-type control plants. Otherwise, the knock-out plants had a wild-type morphological phenotype.

In wild-type plants, G896 was mostly expressed in roots. Changes in environmental conditions did not affect its expression.

Utilities

Since G896 transgenic plants have an altered response to the fungal pathogen Fusarium oxysporum, the gene or its equivalogs could be used to manipulate the defense response in order to generate pathogen-resistant plants.

G911 (SEQ ID NO: 613)

Closely Related Genes from Other Species

An EST (GenBank accession A1352907) induced in the defense response of Brassica napus to Leptosphaeria maculans has extremely high homology to G911 both within and outside the conserved RING H2 domain.

Experimental Observations

The function of G911 was analyzed through its ectopic overexpression in Arabidopsis. RT-PCR of endogenous levels of G911 indicated this gene was expressed in all tissues tested. A cDNA array experiment confirmed the tissue distribution data obtained by RT-PCR. Microarray data confirmed that G911 was overexpressed 23-fold. Other genes that were induced when G911 was overexpressed included RHA1b (another RING C2H3C2 transcription factor), pistillata, and a proline rich protein isolog. Plants overexpressing G911 looked healthier and had longer roots than wild-type plants when grown on media lacking potassium.

Utilities

Plants overexpressing G911 or its equivalogs may be able to be grown with fertilizer that lacks potassium or contains only low levels of potassium.

G912 (SEQ ID NO: 615)

Published Information

G912 was identified in the sequence of P1 clone MSG15 (GenBank accession number AB015478; gene MSG15.6).

Closely Related Genes from Other Species

G912 is closely related to CBF1, CBF2, and CBF3, and is also closely related to the members of the CBF-like subgroup of AP2/EREBP proteins from other plants, such as the Brassica napus dehydration responsive element binding protein (AF084185).

Experimental Observations

G912 was recognized as the AP2/EREBP gene most closely related to Arabidopsis CBF1, CBF2, and CBF3 (Stockinger et al. (1997) Proc. Natl. Acad. Sci. USA 94:1035-1040; Gilmour et al. (1998) Plant J. 16:433-442). In fact, G912 is the only other AP2/EREBP transcription factor for which sequence similarity with CBF1, CBF2, and CBF3 extends beyond the conserved AP2 domain.

The function of G912 was studied using transgenic plants in which this gene was expressed under the control of the 35S promoter. Plants overexpressing G912 were more freezing and drought tolerant than the wild-type controls, but were also small, dark green, and late flowering. There was a positive correlation between the degree of growth impairment and the freezing tolerance. In addition, G912 expression appeared to be induced by cold, drought, and osmotic stress.

These results mirror the extensive body of work that has shown that CBF1, CBF2, and CBF3 are involved in the control of the low-temperature response in Arabidopsis, and that those genes can be used to improve freezing, drought, and salt tolerance in plants (Stockinger et al. (1997) Proc. Natl. Acad. Sci. USA 94:1035-1040; Gilmour et al. (1998) Plant J. 16:433-442; Jaglo-Ottosen et al. (1998) Science 280:104-106; Liu et al. (1998) Plant Cell 10:1391-1406; Kasuga et al. (1999) Nat. Biotechnol. 17:287-291).

The polypeptide sequences of G40, G41, and G42 share 71% (140 of 195 residues), 68% (144 of 211 residues), and 65% (147 of 224 residues) identity with G912, respectively. The conserved domains of G40, G41, and G42 share 94% (64 of 68 residues), 92% (63 of 68 residues), and 94% (64 of 68 residues) identity with G912, respectively. In addition, G912 overexpressing plants also exhibited a sugar sensing phenotype: reduced seedling vigor and cotyledon expansion upon germination on high glucose media.
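The percent identity figures above follow directly from the stated residue counts. A minimal sketch of the arithmetic is given below; it assumes that each percentage is the number of identical residues divided by the number of compared residues, truncated (not rounded) to a whole percent, which is the convention that reproduces every figure quoted in the preceding paragraph. The function name and the dictionary layout are illustrative only.

```python
# Percent identity from (identical residues, compared residues) pairs.
# Counts are taken from the text; truncation to a whole percent is an
# assumption that happens to reproduce the quoted percentages.

def percent_identity(identical, compared):
    """Return identity as a whole-number percentage, truncated."""
    return (100 * identical) // compared


# Full-length polypeptide comparisons against G912.
full_length = {"G40": (140, 195), "G41": (144, 211), "G42": (147, 224)}

# Conserved-domain comparisons against G912.
conserved_domain = {"G40": (64, 68), "G41": (63, 68), "G42": (64, 68)}

if __name__ == "__main__":
    for label, table in (("full length", full_length),
                         ("conserved domain", conserved_domain)):
        for gene, (identical, compared) in table.items():
            print(label, gene, percent_identity(identical, compared))
```

Under ordinary rounding, three of the six values would come out one point higher (for example, 140 of 195 residues is about 71.8%), so the truncation assumption matters when comparing figures across sources.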

Utilities

G912 or its equivalogs could be used to improve plant tolerance to cold, freezing, drought, and salt stress.

G961 (SEQ ID NO: 633)

Published Information

G961 was first identified in the sequence of the BAC clone F19D11, GenBank accession number AC005310, released by the Arabidopsis Genome Initiative.

Closely Related Genes from Other Species

The gene most closely related to G961 is a rice gene (accession number BAA84803).

Experimental Observations

The full length sequence of G961 was experimentally confirmed. The function of this gene was analyzed by knockout analysis. Plants homozygous for a T-DNA insertion in G961 were wild-type for all assays performed.

Gene expression profiling by RT-PCR showed that G961 was primarily expressed in shoots, embryos and siliques at medium levels, and at low levels in flowers. RT-PCR data also indicated an induction of G961 transcript accumulation upon heat treatment.

G961 knockout mutants were found to have altered seed oil content as compared to wild-type plants.

Utilities

G961 or its equivalog knockout mutants may be used to alter seed oil content in plants, which may be very important for the nutritional value and production of various food products.

G971 (SEQ ID NO: 639)

Published Information

G971 corresponds to gene F28P10.30 (CAB41085).

Experimental Observations

The function of G971 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter.

Overexpression of G971 produced a marked delay in the transition to flowering. The effect was noted, to varying extents, in approximately half of the 35S::G971 primary transformants. These plants flowered between one and three weeks later than controls under continuous light conditions. At later stages, most of the plants also appeared darker green and developed larger leaves than controls. Two of the three T2 populations selected for further study displayed a comparable, but rather more extreme late flowering phenotype to that seen in the parental plants. At early stages, seedlings from these two lines were relatively small, but recovered as development progressed, and eventually became larger than wild type. No alterations were detected in 35S::G971 plants in the physiological and biochemical analyses that were performed.

G971 was ubiquitously expressed and did not appear to be significantly induced by any of the conditions tested.

Utilities

G971 or its equivalogs could be used to modify flowering time characteristics. A wide variety of applications exist for systems that either lengthen or shorten the time to flowering.

In species such as sugarbeet where the vegetative parts of the plants constitute the crop and the reproductive tissues are discarded, it would be advantageous to delay or prevent flowering. Extending vegetative development could bring about large increases in yields.

G974 (SEQ ID NO: 641)

Published Information

G974 was first identified in a BAC-end sequence (B28553; partial G974 sequence). G974 corresponds to gene F16L1.8 (BAC F16L1, AC024228).

Closely Related Genes from Other Species

Several AP2 proteins from a variety of species (Atriplex hortensis, Lycopersicon esculentum, Glycine max, Populus balsamifera, Medicago truncatula) exhibited sequence similarity with G974 outside of the signature AP2 domain sequence, and bear nearly identical AP2 domains. These proteins may be related.

Experimental Observations

The complete sequence of G974 was obtained and G974 was studied using transgenic plants in which G974 was expressed under the control of the 35S promoter. Constitutive expression of G974 produced deleterious effects: the majority of 35S::G974 primary transformants showed a reduction in overall size and developed rather slowly compared to wild-type controls. These phenotypic alterations were not observed in the T2 generation, perhaps indicating silencing of the transgene. The T2 plants were wild-type in the physiological and biochemical analyses performed. G974 was ubiquitously expressed.

35S::G974 overexpressors had altered seed oil content.

Utilities

G974 or its equivalogs may be used to alter seed oil content in plants, which may be very important for the nutritional value and production of various food products.

G975 (SEQ ID NO: 643)

Published Information

G975 has appeared in the sequences released by the Arabidopsis Genome Initiative (BAC F9L1, GenBank accession number AC007591).

Closely Related Genes from Other Species

The non-Arabidopsis gene most highly related to G975 is represented by L46408 BNAF1258 Mustard flower buds Brassica rapa cDNA clone F1258. The similarity between G975 and the Brassica rapa gene represented by EST L46408 extends beyond the conserved AP2 domain that characterizes the AP2/EREBP family. This Brassica rapa gene appeared to be more closely related to G975 than Arabidopsis G1387, indicating that EST L46408 may represent a true G975 ortholog. The similarity between G975 and Arabidopsis G1387 also extends beyond the conserved AP2 domain.

Experimental Observations

G975 was identified as a new member of the AP2/EREBP family (EREBP subfamily) of transcription factors. G975 was expressed in flowers and, at lower levels, in shoots, leaves, and siliques. GC-FID and GC-MS analyses of leaves from G975 overexpressing plants showed that the levels of C29, C31, and C33 alkanes were substantially increased (up to ten-fold) compared with control plants. A number of additional compounds of similar molecular weight, presumably also wax components, also accumulated to significantly higher levels in G975 overexpressing plants. C29 alkanes constituted close to 50% of the wax content in wild-type plants (Millar et al. (1998) Plant Cell 11:1889-1902), suggesting that a major increase in total wax content occurred in the G975 transgenic plants. However, the transgenic plants had an almost normal phenotype (although small morphological differences were detected in leaf appearance), indicating that overexpression of G975 was not deleterious to the plant. Overexpression of G975 did not cause the dramatic alterations in plant morphology that had been reported for Arabidopsis plants in which the FATTY ACID ELONGATION1 gene was overexpressed (Millar et al. (1998) Plant Cell 11:1889-1902). G975 may regulate the expression of some of the genes involved in wax metabolism. One Arabidopsis AP2 sequence (G1387) that is significantly more closely related to G975 than the rest of the members of the AP2/EREBP family is predicted to have a function and a use related to that of G975.

Utilities

G975 or its equivalogs can be used to manipulate wax composition, amount, or distribution, which in turn can modify plant tolerance to drought and/or low humidity or resistance to insects, as well as plant appearance (shiny leaves).

G975 or its equivalogs can also be used to specifically alter wax composition, amount, or distribution in those plants and crops from which wax is a valuable product.

G979 (SEQ ID NO: 649)

Published Information

G979 was first identified in a BAC-end sequence (B25031; partial G979 sequence). G979 corresponds to gene T12E18_20 (BAC T12E18, AL132971). No information is available about the function(s) of G979.

Experimental Observations

The complete sequence of G979 was obtained. The function of this gene was studied using both transgenic plants in which G979 was expressed under the control of the 35S promoter (April 2001), and a line with a T-DNA insertion in the gene. G979 codes for an AP2 protein of the AP2 subfamily, i.e., it contains two AP2 domains. The T-DNA insertion of the KO line lies in an intron, located in between the exons coding for the second AP2 domain of the protein, and is thus expected to result in a strong or null mutation. Whereas constitutive expression of G979 produced deleterious effects, the analysis of G979 KO mutant plants proved informative about the function of the gene. It was suggested that proteins of the AP2 subfamily were more likely to be involved in developmental processes (Riechmann et al. (1998) Biol. Chem. 379:633-646). Fittingly, seeds homozygous for a T-DNA insertion within G979 showed delayed ripening, slow germination, and developed into small, poorly fertile plants, indicating that G979 is involved in seed development processes.

The difficulty in initially isolating, from heterozygous plants, progeny that was homozygous for the T-DNA insertion raised the possibility that homozygosity for that allele was lethal. Siliques of heterozygous plants were examined for seed abnormalities. Approximately 25% of the seeds contained in young green siliques were pale in coloration. In older, brown siliques, approximately 25% of the seeds were green and appeared slow ripening, whereas the remaining seeds were brown. Thus, it seemed likely that the seeds with altered development were homozygous for the T-DNA insertion, whereas the normal seeds were wild-type and heterozygous segregants.

Furthermore, it was observed that approximately 25% of the seed from G979 knockout heterozygous plants showed impaired (delayed) germination. Upon germination, these seeds produced extremely tiny seedlings that often did not survive transplantation. A few small and sickly looking homozygous plants could be grown, which produced siliques that contained seeds that were small and wrinkled compared to wild type.

A second, different, T-DNA insertion allele for G979 was identified as part of a TAIL PCR screen. Progeny of the heterozygous plant carrying that T-DNA insertion was either wild-type or heterozygous for the mutation, providing additional evidence for the disruption of G979 being the cause of the phenotypic alterations detected.

The initial analysis of the gene was performed using overexpressing lines. 35S::G979 transformants were generally smaller than wild type and developed spindly inflorescences that carried abnormal flowers with compromised fertility.

G979 expression was ubiquitous and not induced under any of the conditions tested.

Utilities

On the basis of the results obtained with G979 knockout mutant lines, it is possible that G979 or its equivalogs could be used to alter or modify seed germination, ripening and development properties and performance.

G987 (SEQ ID NO: 653)

Published Information

The genomic sequence of G987 is located on the Arabidopsis BAC clone T914 (gene T914.14) (GenBank accession number AC005315).

Experimental Observations

As determined by RT-PCR analysis, G987 was constitutively expressed in all tissues tested. A line homozygous for a T-DNA insertion in G987 was used to determine the function of this gene. The T-DNA insertion in G987 was approximately 4% into the coding sequence of the gene, and therefore is likely to result in a null mutation. G987 mutant plants could only be grown on sucrose-containing medium. Biochemical analyses of leaves from G987 mutants grown on sucrose-containing medium indicated that the mutants had reduced amounts of 16:3 fatty acids, the presence of two xanthophylls which were not present in wild-type leaves, the presence of gamma-tocopherol (which normally accumulates in seed tissue), and reduced levels of chlorophyll a and chlorophyll b.

Utilities

The low amount of 16:3 and dramatic reduction in chlorophyll indicated that the gene controls some aspect of thylakoid membrane development. G987 or its equivalogs may control proplastid to chloroplast development. This could be tested by measuring the expression of some of the genes (e.g., LHCP) that are associated with the transition from proplastid to chloroplast. If this were the case, the gene or its equivalogs may be useful for controlling the transition from proplastid to chromoplast in fruits and vegetables. There may also be some applications where it would be desirable to change the expression of the gene or its equivalogs (e.g., prevent cotyledon greening in Brassica napus or campestris to avoid green oil due to early frost).

G1052 (SEQ ID NO: 699)

Published Information

G1052 was identified in the sequence of BAC F9D24, GenBank accession number AL137081, released by the Arabidopsis Genome Initiative.

Closely Related Genes from Other Species

G1052 is similar to a rice gene (BAA96162). Homology between G1052 and the rice gene extends beyond the conserved domain; thus the two genes may be orthologous.

Experimental Observations

The boundaries of G1052 in BAC AL137081 were experimentally determined and the function of G1052 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. Plants overexpressing G1052 exhibited a delay in flowering and typically produced flower buds about one week later than controls in continuous light conditions. Additionally, these plants had larger leaves and were generally more sturdy than wild type.

A line homozygous for a T-DNA insertion in G1052 was also used to determine the function of this gene. The T-DNA insertion of G1052 was approximately one third of the way into the coding sequence of the gene and therefore is likely to result in a null mutation. A decrease in the percentage of lutein and increase in the xanthophyll 1 fraction was detected in one line in two experiments.

Utilities

The flowering time phenotype associated with G1052 over-expression indicates a utility for G1052 or its equivalogs as genes that can be used to manipulate flowering time in commercial plants. In addition, if G1052 cannot be transmitted through pollen, G1052 or its equivalogs may be used as a tool for preventing transgenes from escaping from transgenic plants through pollen dispersal.

G1052 or its equivalogs could be used to manipulate seed prenyl lipid composition. Lutein is an important nutraceutical, since lutein-rich diets have been shown to help prevent age-related macular degeneration (ARMD), which is the leading cause of blindness in people over the age of 65. In particular, consumption of dark green leafy vegetables has been shown in clinical studies to reduce the risk of ARMD. In addition, lutein, like other xanthophylls such as zeaxanthin and violaxanthin, is an essential component in the protection of the plant against the damaging effects of excessive light. Specifically, lutein contributes, directly or indirectly, to the rapid rise of nonphotochemical quenching in plants exposed to high light. Crop plants engineered to contain higher levels of lutein could therefore have improved photoprotection, possibly leading to less oxidative damage and better growth under high light.

G1062 (SEQ ID NO: 713)

Published Information

G1062 corresponds to gene MLJ15.14 (BAB01738.1).

Closely Related Genes from Other Species

G1062 protein shares extensive homology in the basic helix loop helix region with a cDNA from developing stem of Medicago truncatula (AW691174) as well as a tomato (Lycopersicon esculentum) shoot/meristem cDNA (BG123327).

Experimental Observations

G1062 is a proprietary sequence initially identified from a library clone. The function of G1062 was analyzed by knockout analysis. The T-DNA insertion of G1062 was approximately 75% into the coding sequence of the gene and therefore is likely to result in a null mutation.

Homozygotes for a T-DNA insertion in G1062 showed slow growth and produced abnormal seeds. KO.G1062 plants displayed a longer leaf plastochron than wild type. Both generated flower buds at the same time, but wild-type plants had produced 9-11 rosette leaves at that point, compared to only 5-9 rosette leaves in the mutant (24 hour light). Following bolting, KO.G1062 inflorescences developed more slowly and were shorter than wild type. KO.G1062 seeds appeared twisted and wrinkled in comparison to wild-type seed.

Physiological assays revealed that seedlings from a G1062 knockout mutant line had a light grown phenotype in the dark and were more severely stunted in an ethylene insensitivity assay when compared to the wild-type controls. This result indicated that G1062 may be involved in the ethylene triple response pathway. It is well known that ethylene is involved in the seed ripening process and therefore, the abnormal seed phenotype could be related to a general sensitivity to the ethylene signal transduction pathway.

RT-PCR analysis indicated that the transcripts of G1062 accumulated predominantly in the reproductive tissues. Its expression level did not appear to be affected by any of the treatments tested.

Utilities

G1062 or its equivalogs that alter seed shape are likely to provide ornamental applications.

Since G1062 is involved in the ethylene triple response pathway, G1062 could be used to manipulate the seed or fruit ripening process, and to improve seed or fruit quality.

G1069 (SEQ ID NO: 721)

Published Information

The sequence of G1069 was obtained from the EU Arabidopsis sequencing project, GenBank accession number Z97336, based on its sequence similarity within the conserved domain to other AT-Hook related proteins in Arabidopsis.

Closely Related Genes from Other Species

G1069 protein shares significant homology with a cDNA isolated from a Lotus japonicus nodule library. Similarity between G1069 and the Lotus cDNA extends beyond the signature motif of the family to a level that would suggest the genes are orthologous. Therefore the gene represented by EST AW720668 may have a function and/or utility similar to that of G1069.

Experimental Observations

The sequence of G1069 was experimentally determined and the function of G1069 was analyzed using transgenic plants in which G1069 was expressed under the control of the 35S promoter.

Plants overexpressing G1069 showed changes in leaf architecture, reduced overall plant size, and retarded progression through the life cycle. This is a common phenomenon for most transgenic plants in which AT-Hook proteins are overexpressed if the gene is predominantly expressed in roots in the wild-type background. G1069 was predominantly expressed in roots, based on analysis of RT-PCR results. To minimize these detrimental effects, G1069 may be overexpressed under a tissue-specific promoter, such as a root- or leaf-specific promoter, or under an inducible promoter.

One of the G1069 overexpressing lines showed more tolerance to osmotic stress when germinated on high sucrose plates. This line also showed insensitivity to ABA in a germination assay.

Utilities

The osmotic stress results indicate that G1069 could be used to alter a plant's response to water deficit conditions and, therefore, the gene or its equivalogs could be used to engineer plants with enhanced tolerance to drought, salt stress, and freezing.

G1069 affects ABA sensitivity, and thus when transformed into a plant the gene or its equivalogs may diminish cold, drought, oxidative and other stress sensitivities, and may also be used to alter plant architecture and yield.

G1073 (SEQ ID NO: 723)

Published Information

G1073 has been identified in the sequence of a BAC clone from chromosome 4 (BAC clone F23E12, gene F23E12.50, GenBank accession number AL022604), released by the EU Arabidopsis Sequencing Project.

Closely Related Genes from Other Species

G1073 has similarity to Medicago truncatula cDNA clones (GenBank accession numbers AW574000 and AW560824) and Glycine max cDNA clones (AW349284 and A1736668) in the database.

Experimental Observations

The function of G1073 was analyzed using transgenic plants in which G1073 was expressed under the control of the 35S promoter. Transgenic plants overexpressing G1073 were substantially larger than wild-type controls, with at least a 60% increase in biomass. The increased mass of 35S::G1073 transgenic plants was attributed to enlargement of multiple organ types including leaves, stems, roots and floral organs. Petal size in the 35S::G1073 lines was increased by 40-50% compared to wild-type controls. Petal epidermal cells in those same lines were approximately 25-30% larger than those of the control plants. Furthermore, 15-20% more epidermal cells per petal were produced compared to wild type. Thus, at least in petals, the increase in size was associated with an increase in cell size as well as in cell number. Additionally, images from the stem cross-sections of 35S::G1073 plants revealed that cortical cells were large and that vascular bundles contained more cells in the phloem and xylem relative to wild type.
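The petal measurements above can be cross-checked with simple arithmetic. The following sketch assumes, purely for illustration, that petal size scales as the product of mean epidermal cell size and epidermal cell number; this multiplicative model and the function name are assumptions of the example and are not stated in the experimental results.

```python
# Rough consistency check of the petal data under an assumed model:
# petal size ~ (mean epidermal cell size) x (epidermal cell number).

def combined_increase(cell_size_gain, cell_number_gain):
    """Fractional size increase implied by the two fractional gains."""
    return (1.0 + cell_size_gain) * (1.0 + cell_number_gain) - 1.0


# Lower and upper bounds reported in the text (as fractions).
low = combined_increase(0.25, 0.15)   # 25% larger cells, 15% more cells
high = combined_increase(0.30, 0.20)  # 30% larger cells, 20% more cells

if __name__ == "__main__":
    print(f"implied petal size increase: {low:.1%} to {high:.1%}")
```

Under this assumption the two cellular effects together imply an increase of roughly 44% to 56%, which is of the same order as the 40-50% petal size increase reported, so the observations are mutually consistent with both cell enlargement and additional cell division contributing.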

Seed yield was increased compared to control plants. 35S::G1073 lines showed an increase of at least 70% in seed yield. This increased seed production was associated with an increased number of siliques per plant, rather than seeds per silique.

Flowering of G1073 overexpressing plants was delayed. Leaves of G1073 overexpressing plants were generally more serrated than those of wild-type plants. Improved drought tolerance was observed in 35S::G1073 transgenic lines.

Utilities

Transgenic plants overexpressing G1073 are large and late flowering with serrated leaves. Large size and late flowering produced as a result of G1073 or equivalog overexpression would be extremely useful in crops where the vegetative portion of the plant is the marketable portion (often vegetative growth stops when plants make the transition to flowering). In this case, it would be advantageous to prevent or delay flowering with the use of this gene or its equivalogs in order to increase yield (biomass). Prevention of flowering by this gene or its equivalogs would be useful in these same crops in order to prevent the spread of transgenic pollen and/or to prevent seed set. This gene or its equivalogs could also be used to manipulate leaf shape.

G1075 (SEQ ID NO: 725)

Published Information

The sequence of G1075 was obtained from the Arabidopsis genome sequencing project, GenBank accession number AC004667, based on its sequence similarity within the conserved domain to other AT-Hook related proteins in Arabidopsis.

Closely Related Genes from Other Species

G1075 is homologous to a Medicago truncatula cDNA clone (accession number AW574000).

Experimental Observations

The function of G1075 was analyzed using transgenic plants in which G1075 was expressed under the control of the 35S promoter. Overexpression of G1075 produced very small, sterile plants. Pointed leaves were noted in some seedlings, and twisted or curled leaves and abnormal leaf serrations were noted in rosette stage plants. Bolts were short and thin with short internodes. Flowers from severely affected plants had reduced or absent petals and stamen filaments that partially or completely failed to elongate. Because of the severe phenotypes of these T1 plants, no T2 seed was produced for physiological and biochemical analysis.

RT-PCR analysis indicated that G1075 transcripts were found primarily in roots. The expression of G1075 appeared to be induced by cold and heat stresses.

Utilities

G1075 or its equivalogs could be used to modify plant architecture and development, including flower structure. If expressed under a flower-specific promoter, the gene or its equivalogs might also be useful for engineering male sterility. Because expression of G1075 is root specific, its promoter could be useful for targeted gene expression in this tissue.

G1089 (SEQ ID NO: 731)

Published Information

G1089 was initially identified as a gene represented by Arabidopsis EST H37430. Subsequently, the entire sequence of G1089 was identified in BAC F19K6, GenBank accession number AC037424, released by the Arabidopsis Genome Initiative.

Closely Related Genes from Other Species

The gene most closely related to G1089 is a rice gene represented by NCBI entry g13124871. Similarity between G1089 and the rice gene extends beyond the signature motif of the family to a level that would suggest the genes are orthologous. Therefore the rice gene may have a function and/or utility similar to that of G1089.

Experimental Observations

The boundaries of G1089 were experimentally determined and the function of G1089 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G1089 overexpressing plants had reduced seedling vigor and were characterized as being small, yellow and sickly looking. In addition, a T-DNA knockout of G1089 was isolated. G1089 knockout mutant plants showed more tolerance to osmotic stress in a germination assay in two separate experiments. They showed more seedling vigor than wild-type controls when germinated on plates containing high sucrose. G1089 appeared to be constitutively expressed.

Utilities

The osmotic stress results indicate that G1089 or its equivalogs could be used to alter a plant's response to water deficit conditions and, therefore, may be used to engineer plants with enhanced tolerance to drought, salt stress, and freezing.

G1134 (SEQ ID NO: 741)

Published Information

A partial sequence of G1134 was identified from an EST clone (GenBank accession number A1099951).

Experimental Observations

A partial sequence of G1134 was identified from an EST clone (GenBank accession number A1099951). The 5′ end of the G1134 coding sequence was determined by RACE. The function of G1134 was analyzed using transgenic plants in which G1134 was expressed under the control of the 35S promoter. Primary transformants of G1134 were small with strongly curled leaves. In the T2 generation, two lines had narrow, somewhat curled leaves and siliques with altered shape. A third line segregated for small size. Additionally, plants overexpressing G1134 showed an altered response to the growth hormone ethylene. Seeds that were germinated on ACC plates in the dark had longer hypocotyls than the corresponding controls and occasionally lacked the apical hook that is part of a typical ethylene triple response. In addition, seeds from all lines germinated in the dark had a partial light grown phenotype in that their cotyledons were open and the hypocotyl was straight instead of curled.

The results from morphological and physiological analysis indicated that G1134 protein may play important roles in the regulation of ethylene biosynthesis, ethylene signal transduction pathways, or photomorphogenesis. Analysis of G1134 overexpressors revealed no apparent biochemical changes when compared to wild-type control plants. Analysis of the endogenous expression level of G1134, as determined by RT-PCR, revealed that G1134 was predominantly expressed in flower tissues. Expression of G1134 was not induced by any of the environmental conditions or pathogens tested.

Utilities

G1134 or its equivalogs could be used to alter how plants respond to ethylene and/or light. For example, it could be used to manipulate fruit ripening.

G1198 (SEQ ID NO: 763)

Published Information

The entire sequence of G1198 was reported in BAC T23G18, accession number AC011438, released by the Arabidopsis Genome Initiative.

Closely Related Genes from Other Species

G1198 is very similar to the tobacco bZIP transcription factor TGA2.2 (accession number AF031487). Similarity extends well beyond the conserved domain, suggesting that G1198 and TGA2.2 have similar functions.

Experimental Observations

The boundaries of G1198 were experimentally determined and the function of G1198 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. G1198 overexpressing plants were reduced in size with smaller, narrower leaves and had significantly increased levels of a glucosinolate as compared to wild type. G1198 did not appear to be expressed in rosette leaves, but was expressed in other tissues.

G1198 overexpressing plants were found to have increased seed oil content, as compared to wild-type plants.

Utilities

G1198 or equivalog overexpression may be used to alter seed oil content in plants, which may be very important for the nutritional value and production of various food products.

G1242 (SEQ ID NO: 787)

Published Information

The transcription regulator G1242 was identified by amino acid sequence similarity to proteins of the SWI/SNF family of chromatin remodeling factors. G1242 is found in the sequence of the chromosome 5 BAC clone T1A4 (GenBank accession number AC051627.4; nid=8027925), released by the Arabidopsis Genome Initiative. No additional public information related to the functional characterization of G1242 is available.

Closely Related Genes from Other Species

Sequence comparison of G1242 with sequences available in GenBank reveals strong similarity with plant proteins of several species, including Oryza sativa, Glycine max, Gossypium hirsutum, Sorghum bicolor, Zea mays, Solanum tuberosum and Hordeum vulgare.

Experimental Observations

The function of G1242 was analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. Morphological characterization of the G1242 overexpressing lines revealed a reduction in flowering time in continuous light conditions. This phenotype was of low penetrance and was observed in a minority of lines.

RT-PCR analysis of the endogenous level of G1242 transcripts showed that G1242 is expressed predominantly in Arabidopsis flowers and embryos. A detectable, but low level of G1242 transcripts was observed in all other tissues. G1242 expression increased moderately upon SA treatment and may have been repressed by osmotic and cold treatments.

Utilities

G1242 overexpression appears to alter flowering time by accelerating the transition from vegetative to reproductive state. G1242 or its equivalogs could therefore be used to accelerate flowering time. Most modern crop varieties are the result of extensive breeding programs. Many generations of backcrossing may be required to introduce desired traits. Systems that accelerate flowering could have valuable applications in such programs since they allow much faster generation times. Additionally, in some instances, a faster generation time might allow additional harvests of a crop to be made within a given growing season.

G1255 (SEQ ID NO: 793)

Published Information

G1255 was identified as a gene in the sequence of BAC AC079281, released by the Arabidopsis Genome Initiative.

Closely Related Genes from Other Species

G1255 showed strong homology to a putative rice zinc finger protein represented by sequence AC087181_3. Sequence identity between these two proteins extends beyond the conserved domain, and therefore, these genes can be orthologs.

Experimental Observations

The sequence of G1255 was experimentally determined and G1255 was analyzed using transgenic plants in which G1255 was expressed under the control of the 35S promoter. Plants overexpressing G1255 had alterations in leaf architecture, a reduction in apical dominance, an increase in seed size, and showed more disease symptoms following inoculation with a low dose of the fungal pathogen Botrytis cinerea. G1255 was constitutively expressed and not significantly induced by any conditions tested.

Utilities

On the basis of the phenotypes produced by overexpression of G1255, G1255 or its equivalogs can be used to manipulate the plant's defense response to produce pathogen resistance, alter plant architecture, or alter seed size.

G1266 (SEQ ID NO: 799)

Published Information

G1266 corresponds to ERF1, ‘ethylene response factor 1’ (GenBank accession number AF076277) (Solano et al. (1998) Genes Dev. 12: 3703-3714). ERF1 was isolated in a search for Arabidopsis EREBP-like genes using a PCR-based approach. ERF1 expression was shown to be rapidly induced by ethylene, and to be dependent on the presence of functional EIN3 (ETHYLENE-INSENSITIVE3), as no expression was detected in ein3-1 mutants (Solano et al. (1998) supra). Furthermore, ERF1 mRNA showed constitutive high-level expression in 35S::EIN3-expressing transgenic plants, and EIN3 was shown to bind to sequences in the ERF1 promoter in a sequence-specific manner (Solano et al. (1998) supra). All these results indicated that ERF1 is downstream of EIN3 in the ethylene signaling pathway, and that both proteins act sequentially in a cascade of transcriptional regulation initiated by ethylene gas (Solano et al. (1998) supra). ERF1 binds specifically to the GCC element, which is a particular type of ethylene response element that is found in the promoters of genes induced upon pathogen attack (Solano et al. (1998) supra). 35S::ERF1-expressing transgenic plants displayed phenotypes similar to those observed in the constitutive ethylene response mutant ctr1 or in wild-type plants exposed to ethylene; however, expression of only a partial seedling triple response in these lines indicated that ERF1 mediates only a subset of the ethylene responses (Solano et al. (1998) supra). At the adult stage, 35S::ERF1-expressing transgenic plants showed a dwarf phenotype, and some ethylene-inducible genes, like basic-chitinase and PDF1.2, were constitutively activated in those lines (Solano et al. (1998) supra). All these results showed that ERF1 is a downstream ethylene signaling pathway gene.

Closely Related Genes from Other Species

The sequences of Nicotiana tabacum S25-XP1 (GenBank accession number AAB38748) and G1266 are very similar, with similarity between the two proteins extending beyond the conserved AP2 domain.

Experimental Observations

The function of G1266 was further analyzed using transgenic plants in which this gene was expressed under the control of the 35S promoter. As expected from the previously published work, G1266 overexpressing plants showed a dwarf phenotype. In physiological assays, it was shown that G1266 overexpressing plants were more tolerant to infection with a moderate dose of the fungal pathogen Erysiphe orontii. The resistance phenotype to the fungal pathogen Erysiphe orontii has been repeated. This phenotype might be a consequence of ERF1 being a downstream ethylene signaling pathway gene. Constitutive expression of G1266 might accelerate leaf senescence, which in turn might impair infection by Erysiphe orontii.

In addition, when analyzed for leaf insoluble sugar composition, three lines showed alterations in rhamnose, arabinose, xylose, mannose, and galactose when compared with wild-type plants.

Utilities

G1266 has been shown to be a downstream ethylene signaling pathway gene, and experiments implicate this gene in the plant response to the fungal pathogen Erysiphe orontii. G1266 or its equivalogs could therefore be used to engineer plants with a modulated response to that and other pathogens, for example, plants showing increased resistance.

G1274 (SEQ ID NO: 805)

Published Information

G1274 is a member of the WRKY family of transcription factors. The gene corresponds to WRKY51 (At5g64810). No information is available about the function(s) of G1274.

Experimental Observations

RT-PCR analysis was used to determine the endogenous expression pattern of G1274. Expression of G1274 was detected in leaf, root and flower tissues. The biotic stress-related conditions, Erysiphe and SA, induced expression of G1274 in leaf tissue. The gene also appeared to be slightly induced by osmotic and cold stress treatments and perhaps by auxin.

The function of G1274 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. G1274 overexpressing lines were more tolerant to growth on low nitrogen-containing media. In an assay intended to determine whether the transgene expression could alter C:N sensing, 35S::G1274 seedlings contained less anthocyanins than wild-type controls grown on high sucrose/N- and high sucrose/N/Gln plates. These data together suggested that overexpression of G1274 may alter a plant's ability to modulate carbon and/or nitrogen uptake and utilization.

In addition, 35S::G1274 transgenics were more tolerant to chilling compared to the wild-type controls in both germination and seedling growth assays.

Overexpression of G1274 produced alterations in leaf morphology and inflorescence architecture. Four out of eighteen 35S::G1274 primary transformants were slightly small and developed inflorescences that were short, and showed reduced internode elongation, leading to a bushier, more compact stature than in wild-type.

In an experiment using T2 populations, it was observed that the rosette leaves from many of the plants were distinctly broad and appeared to have a greater rosette biomass than in wild type.

Utilities

The enhanced performance of 35S::G1274 seedlings under chilling conditions suggested that the gene might be applied to engineer crops that show better growth under cold conditions. In particular, photo-inhibition of photosynthesis (disruption of photosynthesis due to high light intensities) often occurs under clear atmospheric conditions subsequent to cold late summer/autumn nights. Chilling may also lead to yield losses and lower product quality due to poor pollination and delayed ripening. Given that the growth of many crops is very sensitive to cool temperatures, a gene that enhances growth under cool conditions could enhance yields, extend the effective growth range of chilling sensitive crop species, and reduce fertilizer and herbicide usage. Another large impact of chilling occurs during post-harvest storage. For example, some fruits and vegetables do not store well at low temperatures (for example, bananas, avocados, melons, and tomatoes). The normal ripening process of the tomato is impaired if it is exposed to cool temperatures. Genes conferring resistance to chilling temperatures may enhance tolerance during post-harvest storage.

Chilling tolerance could also serve as a model for understanding how plants adapt to water deficit. Both chilling and water stress share similar sensory transduction pathways and tolerance/adaptation mechanisms. For example, acclimation to chilling temperatures can be induced by water stress or treatment with abscisic acid. Genes induced by low temperature include dehydrins (or LEA proteins). Dehydrins are also induced by salinity, abscisic acid, water stress or during the late stages of embryogenesis. Thus, genes that protect the plant against chilling could also have a role in protection against water deficit.

A secondary consequence of slow growth under cold conditions is poor ground cover (of maize fields) in the spring, resulting in soil erosion, increased occurrence of weeds, and reduced uptake of nutrients. A retarded uptake of mineral nitrogen can then lead to increased losses of nitrate into the ground water.

The enhanced performance of G1274 overexpression lines under low nitrogen conditions indicated that the gene could be used to engineer crops that could thrive under conditions of reduced nitrogen availability. Such a trait would provide cost savings to the farmer by reducing the amount of fertilizer needed, provide the environmental benefit of reduced fertilizer run-off into watersheds, and improve stress tolerance and yield.

35S::G1274 overexpressing lines made less anthocyanin on high sucrose plus glutamine, indicating G1274 might be used to modify carbon and nitrogen status, and hence assimilate partitioning.

The morphological phenotype shown by 35S::G1274 lines indicated that the gene might be used to alter inflorescence architecture, to produce more compact dwarf forms that might afford yield benefits.

The effects on leaf size that were observed as a result of G1274 overexpression might also have commercial applications. Increased leaf size, or an extended period of leaf growth, could increase photosynthetic capacity and biomass, and thus have a positive effect on yield.

G1275 (SEQ ID NO: 807)

Published Information

G1275 was first identified in the sequence of BAC T19G15 (GenBank accession number AC005965).

Experimental Observations

The cDNA sequence of G1275 was determined. G1275 was ubiquitously expressed, although expression levels differed among tissues. It is possible that G1275 expression is induced by several stimuli, including infection by Erysiphe, Fusarium, and SA treatment.

The function(s) of G1275 were investigated using both knock-out mutants and overexpressing plants in which this gene was expressed under the control of the 35S promoter.

Primary transformants of G1275 were small with reduced apical dominance. The inflorescence stems produced by these plants did not elongate normally. The plants were fertile, but seed yield was reduced because the plants were severely dwarfed.

In the knock-out mutant, the T-DNA insertion in G1275 was localized in the second intron of the gene, which is located within the conserved WRKY-box. Such an insertion would result in a null mutation (unless the large fragment of exogenous sequence is perfectly spliced out from the transcribed G1275 pre-mRNA). G1275 knock-out mutant plants were indistinguishable from wild-type controls in all assays performed.

Utilities

G1275 or its equivalogs might be used to alter plant development or architecture.

G1305 (SEQ ID NO: 819)

Published Information

G1305 is a member of the (R1)R2R3 subfamily of myb transcription factors. G1305 corresponds to the gene MYB10 (Kranz et al. (1998) Plant J. 16: 263-276).

Experimental Observations

The function of G1305 was analyzed using transgenic plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G1305 in Arabidopsis resulted in seedlings that were more tolerant to heat in a germination assay. Seedlings from G1305 overexpressing transgenics were greener than the control seedlings under high temperature conditions. In a repeat experiment, two lines showed the heat tolerant phenotype. In addition, plants from two of the 35S::G1305 T2 lines flowered several days earlier than wild type in each of two independent sowings (24 hour light conditions). The plants had rather flat leaves compared to controls and formed slightly thin inflorescences in some cases.

According to RT-PCR, G1305 was expressed ubiquitously and expression of the gene was unaltered in response to the environmental stress-related conditions tested.

Utilities

On the basis of the analyses performed to date, the potential utility of G1305 or its equivalogs is to regulate a plant's time to flower.

G1305 or its equivalogs may also be used to improve heat tolerance at germination. The germination of many crops is very sensitive to temperature. A gene that would enhance germination in hot conditions may be useful for crops that are planted late in the season or in hot climates.

G1313 (SEQ ID NO: 829)

Published Information

G1313 (At5g06100) corresponds to AtMYB33. Gocal et al. ((2001) Plant Physiol. 127: 1682-1693) showed that G1313 (AtMYB33) could bind to the GA response element and activate the barley alpha-amylase promoter in a transient assay in barley aleurone cells. The gene is ubiquitously expressed in Arabidopsis. It was hypothesized that the gene could regulate GA responsive pathways that promote flowering in Arabidopsis. To test this hypothesis, Gocal et al. ((2001) supra) analyzed whether AtMYB33 is capable of binding to the LFY gene promoter. LFY is a floral meristem identity gene that has a GA responsive element in its promoter. AtMYB33 was found to bind to the LFY promoter, suggesting that the action of GA on flowering could be mediated through the activity of AtMYB33 (Gocal et al. (2001) supra).

Experimental Observations

The complete sequence of G1313 was determined. The function of this gene was analyzed using transgenic plants in which G1313 was expressed under the control of the 35S promoter. 35S::G1313 transgenics were wild-type in response to all physiological stress treatments performed.

Interestingly, overexpression of G1313 produced an apparent increase in seedling vigor in some of the T1 plants at an early seedling stage, under normal growth conditions, compared to the wild-type controls. This effect was observed in a single T2 line in both the morphology and physiology assays. Given that GAs are known to promote seed germination, the increased seedling vigor could be related to a GA response in seeds. The lack of an effect of G1313 on flowering time could result from the fact that an additional factor is required for the activity of the protein. It should be noted that all the assays were performed under continuous light; since GAs are known to be critical for the floral transition to occur under SDs (8-hour light) in Arabidopsis, 35S::G1313 lines may have an altered flowering time response under such conditions.

Utilities

The increase in seedling vigor in G1313 transgenic plants suggested that this gene could be used to increase survivability and vigor of small seedlings under field conditions, potentially leading to a greater yield in crops. Published results suggested that the gene might also be used to modify flowering time in commercial species.

G1322 (SEQ ID NO: 841)

Published Information

G1322 is a member of the (R1)R2R3 subfamily of myb transcription factors. G1322 corresponds to Myb57, a gene identified by Kranz et al. ((1998) Plant J. 16: 263-276). The authors used a reverse-Northern blot technique to study the expression of this gene in a variety of tissues and under a variety of environmental conditions. They were unable to detect the expression of G1322 in any tissue or treatments tested (Kranz et al. (1998) Plant J. 16: 263-276).

Closely Related Genes from Other Species

G1322 shows sequence similarity with known genes from other plant species within the conserved Myb domain.

Experimental Observations

G1322 was analyzed using transgenic plants in which the gene was expressed under the control of the 35S promoter. 35S::G1322 transgenic plants were wild-type in phenotype with respect to the biochemical analyses performed. Overexpression of G1322 produced changes in overall plant size and leaf development. At all stages, 35S::G1322 plants were distinctly smaller than controls and developed curled dark-green leaves. Following the switch to flowering, the plants formed relatively thin inflorescence stems and had a rather poor seed yield. In addition, overexpression of G1322 resulted in plants with an altered etiolation response as well as enhanced tolerance to germination under chilling conditions. When germinated in the dark, G1322 overexpressing transgenic plant lines had open, slightly green cotyledons. Under chilling conditions, all three transgenic lines displayed a similar germination response: seedlings were slightly larger and had longer roots. In addition, an increase in the leaf glucosinolate M39480 was observed in all three T2 lines. According to RT-PCR analysis, G1322 was expressed primarily in flower tissue.

Utilities

The utilities of G1322 or its equivalogs include altering a plant's chilling sensitivity and altering a plant's light response. The germination of many crops is very sensitive to cold temperatures. A gene that will enhance germination and seedling vigor in the cold has tremendous utility in allowing seeds to be planted earlier in the season with a higher survival rate.

G1322 or its equivalogs can also be useful for altering leaf glucosinolate composition. Increases or decreases in specific glucosinolates or total glucosinolate content are desirable depending upon the particular application. Modification of glucosinolate composition or quantity can therefore afford increased protection from predators. Furthermore, in edible crops, tissue specific promoters can be used to ensure that these compounds accumulate specifically in tissues, such as the epidermis, which are not taken for consumption.

G1323 (SEQ ID NO: 843)

Published Information

Kranz et al. ((1998) Plant J. 16: 263-276) published a partial cDNA sequence corresponding to G1323, naming it MYB58. Reverse-Northern data indicate that this gene is expressed primarily in leaf tissue.

Experimental Observations

The complete sequence of G1323 was determined. As determined by RT-PCR, G1323 was highly expressed in embryos, and was expressed at significantly lower levels in the other tissues tested. G1323 expression was not induced by any stress-related treatments. The function of this gene was analyzed using transgenic plants in which G1323 was expressed under the control of the 35S promoter. Primary transformants of G1323 were uniformly small and dark green, and a few were late flowering. According to the biochemical analysis of G1323 overexpressors, two had higher seed protein. The higher seed protein and lower seed oil content were observed in a repeated experiment.

Utilities

G1323 or its equivalogs could be used to alter seed protein and oil amounts and/or composition, which is very important for the nutritional value and production of various food products.

G1332 (SEQ ID NO: 855)

Published Information

G1332 is a member of the (R1)R2R3 subfamily of myb transcription factors. G1332 corresponds to the gene MYB82 (Kranz et al. (1998) Plant J. 16: 263-276).

Experimental Observations

The function of G1332 was analyzed using transgenic plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G1332 produced a reduction in trichome density on leaf surfaces and inflorescence stems in Arabidopsis. No other phenotypic alterations were observed in the G1332 overexpressors.

G1332 was expressed ubiquitously and may have been repressed by Erysiphe infection.

Utilities

The potential utility of this gene or its equivalogs is to alter trichome initiation and number in a plant. It would be of great agronomic value to have plants that produce greater numbers of glandular trichomes that produce valuable essential oils for the pharmaceutical and food industries, as well as oils that protect plants against insect and pathogen attack.

G1412 (SEQ ID NO: 879)

Published Information

G1412 is a member of the NAC family of transcription factors. G1412 corresponds to gene At4g27410, annotated by the Arabidopsis Genome Initiative. The gene corresponds to sequence 1543 from patent application WO0216655 A2 on stress-regulated genes, transgenic plants and methods of use. In this publication, G1412 was reported to be cold, osmotic and salt responsive in microarray analysis. No information is available about the function(s) of G1412.

Experimental Observations

RT-PCR was used to analyze the endogenous expression pattern of G1412. The gene appeared to be constitutively expressed in all tissues tested. Furthermore, induction of G1412 in leaf tissue was observed in response to ABA, heat, drought, and mannitol.

A T-DNA insertion mutant for G1412 was then analyzed. The mutant displayed a wild-type morphology, and was wild-type in its response to the physiological analyses that were performed.

G1412 overexpressing transformants displayed wild-type morphology. However, the 35S::G1412 transgenics were insensitive to ABA and were more tolerant to osmotic stress in a germination assay on media containing high concentrations of sucrose.

Utilities

The phenotypic effects of G1412 overexpression, such as the increase in seedling vigor observed in a germination assay on high sucrose media and insensitivity to germination on ABA media, indicated that the gene could be used to engineer plants with increased tolerance to abiotic stresses such as drought, salt, heat or cold.

G1417 (SEQ ID NO: 881)

Published Information

G1417 corresponds to gene AT4g01720 (CAB77742).

Closely Related Genes from Other Species

G1417 shows sequence similarity, outside of the conserved WRKY domain, with a rice protein (gi8467950).

Experimental Observations

The function of G1417 was studied using a line homozygous for a T-DNA insertion in the gene. The T-DNA insertion lies immediately upstream of the conserved WRKY domain coding sequence, and was expected to result in a null mutation. G1417 knockout mutant plants showed reduced seedling vigor during germination. The G1417 knockout showed alterations in seed fatty acid composition. An increase in 18:2 fatty acid and a decrease in 18:3 fatty acid were observed in two seed batches.

G1417 was ubiquitously expressed and did not appear to be significantly induced by any of the conditions tested.

Utilities

G1417 or its equivalogs could be useful to manipulate the saturation levels of lipids in seeds. Alteration in seed lipid saturation could be used to improve the heat stability of oils or to improve the nutritional quality of seed oil.

G1449 (SEQ ID NO: 891)

Published Information

G1449 is annotated in the sequence of genomic clone MKP6, GenBank accession number AB022219, released by the Arabidopsis Genome Initiative.

Experimental Observations

A cDNA clone corresponding to G1449 was isolated from an embryo cDNA library. It was later identified in the sequence of genomic clone MKP6, GenBank accession number AB022219, released by the Arabidopsis Genome Initiative.

G1449 was expressed at high levels in embryos and siliques, and at significantly lower levels in roots and seedlings. It was induced by auxin in leaf tissue. Plants overexpressing G1449 showed floral abnormalities. Primary transformants showed changes in floral organ number and identity. Large petals were noted in one plant. Affected lines were also somewhat smaller than controls. These plants produced little seed and it was necessary to bulk seed for analysis. One T3 line produced flowers that were somewhat larger than control flowers with petals that were more open. These flowers often had extra petals. G1449 mutant plants did not show any other phenotypic alterations in any of the physiological or biochemical assays performed.

Utilities

Because larger and more open petals are produced in some G1449 overexpressing plants, G1449 or its equivalogs may be useful for modifying flower form and size in ornamental plants. The promoter of G1449 may also be useful to drive gene expression in seeds and seed pods or fruits.

G1451 (SEQ ID NO: 893)

Published Information

G1451 is ARF8, a member of the ARF class of proteins with a VP1-like N-terminal domain and a C-terminal domain with homology to Aux/IAA proteins. ARF8, like several other ARFs, contains a glutamine-rich central domain that can function as a transcriptional activation domain (1). ARF8 was shown to bind to an auxin response element (2). It was also shown that a truncated version of ARF8 lacking the DNA binding domain but containing the activation domain and the C-terminal domain could activate transcription on an auxin responsive promoter, presumably through interactions with another factor bound to the auxin response element (1). ARF8 is closely related in sequence to ARF6 (2).

Experimental Observations

G1451 was expressed throughout the plant, with the highest expression in flowers. Transcripts of G1451 were induced in leaves by a variety of stress conditions. A line homozygous for a T-DNA insertion in G1451 was used to determine the function of this gene. The T-DNA insertion of G1451 is approximately one-fifth of the way into the coding sequence of the gene and therefore is likely to result in a null mutation.

As measured by NIR, G1451 knockout mutants had increased total combined seed oil and seed protein content compared to wild-type plants.

Utilities

G1451 or its equivalogs may be used to alter seed oil and protein content, which may be very important for the nutritional value and production of various food products.

G1451 or its equivalogs could also be used to increase plant biomass. Large size is useful in crops where the vegetative portion of the plant is the marketable portion since vegetative growth often stops when plants make the transition to flowering.

G1468 (SEQ ID NO: 895)

Published Information

The genomic sequence of G1468 is located on the Arabidopsis BAC clone T7123 (GenBank accession number U89959). There is no published information available regarding the function of G1468.

Experimental Observations

A T-DNA insertion mutant for G1468 behaved similarly to the wild-type controls in all morphological and physiological assays performed. G1468 was predominantly expressed in flowers and embryos. Furthermore, its expression level was unaffected by any of the conditions tested.

The function of G1468 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G1468 produced plants that were rather dark in coloration compared to wild type controls at early stages. Severely affected individuals arrested growth early in vegetative development. Plants that survived formed narrow, gray leaves and showed a marked delay in onset of flowering. Many of the late flowering plants had more axillary rosette leaves compared to controls, leading to an increase in vegetative biomass.

Utilities

The alterations in leaf shape, size, and coloration shown by 35S::G1468 transformants indicated that the gene or its equivalogs may be applied to modify plant architecture.

The delayed bolting suggests that the gene might also be used to manipulate flowering time in commercial species. In particular, an extension of vegetative growth can significantly increase biomass and result in substantial yield increases.

G1471 (SEQ ID NO: 897)

Published Information

G1471 was identified in the sequence of P1 clone MDK4, GenBank accession number AB010695, released by the Arabidopsis Genome Initiative.

Experimental Observations

The function of this gene was analyzed using transgenic plants in which G1471 was expressed under the control of the 35S promoter. All 35S::G1471 primary transformants were markedly small, had narrow curled leaves and formed thin inflorescence stems. Flowers from many T1 plants were extremely poorly developed, and often had organs missing, reduced in size, or highly contorted. Due to such defects, the fertility was very low, and approximately one third of the lines were tiny and completely sterile. Plants from one T2 generation line displayed wild-type morphology, indicating that the transgene might have become silenced. Two lines, however, were small, had narrow curled leaves and flowered marginally earlier than controls. The phenotype of these transgenic plants was wild-type in all other assays performed. G1471 appeared to be expressed at medium levels in siliques and embryos.

G1471 overexpressing plants were found to have increased seed oil content compared to wild-type plants.

Utilities

G1471 or equivalog overexpression may be used to increase seed oil content in plants.

Because expression of G1471 is embryo and silique specific, its promoter could be useful for targeted gene expression in these tissues.

G1482 (SEQ ID NO: 905)

Published Information

G1482 was identified as a gene in the sequence of BAC AC006434, released by the Arabidopsis Genome Initiative.

Experimental Observations

The sequence of G1482 was experimentally determined. The data presented for this gene are from plants homozygous for a T-DNA insertion in G1482. The T-DNA insertion of G1482 is in coding sequence and therefore this knockout mutant is likely to contain a null allele. Homozygous plants harboring a T-DNA insertion in G1482 displayed significantly more root growth on MS control plates as well as on different stresses in three separate experiments. G1482 was constitutively expressed and significantly induced by auxin, ABA and osmotic stress.

The function of G1482 was studied using transgenic plants in which thegene was expressed under the control of the 35S promoter. Plantsoverexpressing G1482 contained high levels of anthocyanins.

Utilities

Based on the phenotypes produced when this gene is knocked out, G1482 or its equivalogs could be used to manipulate root growth, particularly in response to environmental stresses such as drought and low nutrients.

G1482 or its equivalogs could also be used to alter anthocyanin production. The potential utilities of this gene include alterations in pigment production for horticultural purposes, and possibly increasing stress resistance in combination with another transcription factor. Flavonoids have antimicrobial activity and could be used to engineer pathogen resistance. Several flavonoid compounds have health promoting effects such as the inhibition of tumor growth and cancer, prevention of bone loss and the prevention of the oxidation of lipids. Increasing levels of condensed tannins, whose biosynthetic pathway is shared with anthocyanin biosynthesis, in forage legumes is an important agronomic trait because they prevent pasture bloat by collapsing protein foams within the rumen. For a review on the utilities of flavonoids and their derivatives, see Dixon et al. (1999) Trends Plant Sci. 10: 394-400.

G1499 (SEQ ID NO: 913)

Published Information

The sequence of G1499 was obtained from the Arabidopsis genome sequencing project, GenBank accession number AB020752, based on its sequence similarity within the conserved domain to other bHLH related proteins in Arabidopsis.

Closely Related Genes from Other Species

The similarity between G1499 and Brassica rapa subsp. pekinensis flower bud cDNA (acc#AT002234) is significant not only in the conserved bHLH domains but also outside of the domains.

Experimental Observations

The function of G1499 was analyzed using transgenic plants in which G1499 was expressed under the control of the 35S promoter. A range of phenotypes was observed in primary transformants of G1499. The most severely affected plants were smaller than controls, dark green, with strongly curled leaves, and produced bolts that terminated without an inflorescence. In some cases, flowers were replaced with filamentous structures or carpelloid structures. Less severely affected lines produced flowers where sepals were converted to carpelloid tissue. Petals and stamens were absent or reduced in size and number. Mildly affected T1 plants that were small in size but produced normal flowers were taken to the T2 generation. Three T2 lines produced plants that were smaller than controls, darker green, and had narrower leaves.

G1499 overexpressors were similar to their wild-type counterparts in all physiological and biochemical assays.

G1499 was predominantly expressed in reproductive tissues such as flower, embryo and silique. Lower levels of expression were also detected in roots and germinating seeds. Its expression level was unaffected by any of the environmental conditions tested.

Phenotypes produced by overexpressing G1499 and G779 were similar with respect to flower structure. Cluster analysis using the basic helix-loop-helix motif revealed that the G1499 and G779 proteins are closely related.

Utilities

G1499 or its equivalogs could be used to modify plant architecture and development, including flower structure. If expressed under a flower-specific promoter, it might also be useful for engineering male sterility. Because expression of G1499 is flower and embryo specific, its promoter could be useful for targeted gene expression in these tissues.

Potential utilities of this gene or its equivalogs also include increasing chlorophyll content, allowing more growth and productivity in conditions of low light. With a potentially higher photosynthetic rate, fruits could have higher sugar content. Increased carotenoid content could be used as a nutraceutical to produce foods with greater antioxidant capability.

G1540 (SEQ ID NO: 919)

Published Information

G1540 is the Arabidopsis WUSCHEL (WUS) gene and encodes a novel subclass of homeodomain protein (Mayer et al. (1998) Cell 95:805-815).

WUS is a key developmental protein that has a core role in regulating the fate of stem cells within Arabidopsis apical meristems. The central zone of an apical meristem contains a pool of undifferentiated pluripotent stem cells. These stem cells are able to both maintain themselves and supply cells for incorporation into new organs on the periphery of the meristem (shoot meristems initiate leaves whereas flower meristems initiate whorls of floral organs).

Defects are visible in the shoots and flowers of wus mutants (Laux et al. (1996) Development 122: 87-96; Endrizzi et al. (1996) Plant J. 10:967-979). Wus mutants fail to properly organize a shoot meristem in the developing embryo. Post-embryonically, wus shoot meristems become flattened and terminate growth prematurely. Leaf primordia and secondary shoots often initiate ectopically across the surface of these terminated structures. The leaf primordia usually develop into a disorganized bunch and a secondary shoot meristem takes over growth. This secondary meristem then terminates and the developmental pattern is repeated, leading to a plant with no clear main axis of growth and clusters of leaves at the tips of shoots. Wus floral meristems exhibit a comparable phenotype to the shoot meristem; development often ceases prematurely such that flowers either lack the innermost whorls of organs, or possess a single stamen in place of the inner whorls.

The mutant phenotype indicates that wus is required to maintain the identity of the central zone within apical meristems and prevent those cells from becoming differentiated. In situ expression patterns of WUS RNA support such a conclusion; WUS is first observed in the embryonic shoot meristem at the 16-cell stage. Later, expression becomes confined to small groups of cells (in shoot and floral meristems) at the base of the central zone where it specifies the fate of overlying cells as stem cells. WUS is thought to be expressed, and act, independently of another homeobox gene, SHOOT MERISTEMLESS (STM), G431, which has a related function (Long et al. (1996) Development 125:3027-3035). STM is initially required for the establishment of the shoot meristem during embryogenesis. Later STM is expressed throughout the whole meristem dome where, together with an antagonist, CLAVATA1, it regulates transition of cells from the central zone towards differentiation and organ formation at the meristem periphery (Clarke et al. (1996) Development 122: 1565-1575; Endrizzi et al. (1996) Plant J. 10:967-979). A current hypothesis is that WUS specifies the identity of central stem cells whereas STM allows the progeny of those cells to proliferate before being partitioned into organ primordia (Mayer et al. (1998) Cell 95:805-815).

The effects of WUS over-expression have not yet been published. However, based on the present model for WUS function, its ectopic expression might be expected to induce formation of ectopic meristematic stem cells.

Experimental Observations

Over-expressers for G1540 (WUSCHEL) formed callus-like structures on leaves, stems and floral organs. These observations correlate with the proposed role of WUS in specifying stem cell fate in meristems. In T1 over-expressers, cells took on characteristics of stem cells at inappropriate locations, indicating that WUS was sufficient to specify stem cell identity.

Utilities

The over-expression phenotype indicates that G1540 is sufficient to confer stem cell identity on plant cells, and thereby prevent them from differentiating. The gene or its equivalogs might be of utility in the maintenance of plant cell lines grown in vitro, where the differentiation of those lines creates difficulties. The gene or its equivalogs might also be applied to transformation systems for recalcitrant species, where generation of callus is currently problematic but is required as part of the transformation procedure.

G1560 (SEQ ID NO: 925)

Published Information

The heat shock transcription factor G1560 is a member of the class-A HSFs (Nover et al. (1996) Cell Stress & Chaperones 1:215-223) characterized by an extended HR-A/B oligomerization domain. G1560 is found in the sequence of the chromosome 3, P1 clone: MW123 (GenBank accession AB022223.1 GI:4159712), released by the Arabidopsis Genome Initiative. The translational start and stop codons were correctly predicted. No additional public information related to the functional characterization of G1560 is available.

Closely Related Genes from Other Species

Amino acid sequence comparison with entries available from GenBank shows strong similarity with plant HSFs of several species (Oryza sativa, Lycopersicon peruvianum, Medicago truncatula, Solanum tuberosum, Lycopersicon esculentum, Glycine max, Pisum sativum, Hordeum vulgare subsp. vulgare, Triticum aestivum and Lotus japonicus; see accompanying BLAST reports).

Experimental Observations

The function of G1560 was analyzed through its ectopic overexpression in plants. Analysis of the endogenous level of G1560 transcripts by RT-PCR revealed that this gene was predominantly expressed in shoots, flowers, embryo and siliques. G1560 expression was induced strongly in response to heat shock treatment and moderately after auxin, ABA, drought and osmotic treatment. The inducibility of G1560 by heat shock treatment suggested that G1560 may play a central role in the regulation of the heat shock response in plants. Physiological analysis of G1560 overexpressors revealed no changes compared to wild-type controls upon either heat shock or osmotic stress treatment.

Overexpression of G1560 resulted in transgenic T1 and T2 plants with altered morphological characteristics. Throughout development, transgenic 35S::G1560 T1 and T2 plants were smaller than wild-type plants and showed abnormalities in flower development. In several independent lines floral organs, mainly petals and stamens, were poorly developed or absent, and flower buds were generally smaller and round-shaped. This phenotype resulted in poor fertility and low seed yield. Current models regarding the mode of action of chaperones (regulated by HSFs) are insufficient to explain this phenotype at the molecular level, or to suggest whether the phenotype is a direct or indirect consequence of the overexpression of HSF G1560.

Utilities

The overexpression of G1560 resulted in plants of small size and altered flower morphology. Alteration of G1560 expression could potentially be used to modify plant development and fertility.

G1634 (SEQ ID NO: 927)

Published Information

G1634 was identified in the sequence of BAC MJJ3, GenBank accession number AB005237, released by the Arabidopsis Genome Initiative.

Experimental Observations

The complete sequence of G1634 was determined. cDNA microarray analyses of the endogenous levels of G1634 indicated that this gene was primarily expressed in root and silique tissues. In addition, G1634 expression was not altered significantly in response to any of the stress-related treatments tested. The function of this gene was analyzed using transgenic plants in which G1634 was expressed under the control of the 35S promoter. The phenotype of these transgenic plants was wild-type in all assays performed.

G1634 overexpressors were found to have altered seed protein content compared to wild-type plants.

Utilities

G1634 or its equivalogs could be used to alter seed protein amounts, which are very important for the nutritional value and production of various food products.

G1645 (SEQ ID NO: 929)

Published Information

G1645 is a member of the (R1)R2R3 subfamily of MYB transcription factors. G1645 was identified in the sequence of BAC T24P13, GenBank accession number AC006535, released by the Arabidopsis Genome Initiative.

Closely Related Genes from Other Species

G1645 shows extensive sequence similarity to MYB proteins from other plant species including tomato (AW624217) and alfalfa (AQ917084).

Experimental Observations

The function of G1645 was analyzed using transgenic plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G1645 produced marked changes in Arabidopsis leaf, flower and shoot development. These effects were observed, to varying extents, in the majority of 35S::G1645 primary transformants.

At early stages, many 35S::G1645 T1 lines appeared slightly small and most had rather rounded leaves. However, later, as the leaves expanded, in many cases they became misshapen and highly contorted. Furthermore, some of the lines grew slowly and bolted markedly later than control plants. Following the switch to flowering, 35S::G1645 inflorescences often showed aberrant growth patterns, and had a reduction in apical dominance. Additionally, the flowers were frequently abnormal and had organs missing, reduced in size, or contorted. Pollen production also appeared poor in some instances. Due to these deficiencies, the fertility of many of the 35S::G1645 lines was low and only small numbers of seeds were produced.

Overexpression of G1645 resulted in low germination efficiency when seeds were germinated under heat stress at 32° C.

As determined by RT-PCR, G1645 was expressed in flowers, embryos, germinating seeds and siliques. No expression of G1645 was detected in the other tissues tested. G1645 expression appeared to be repressed in rosette leaves infected with the phytopathogen Erysiphe orontii.

Utilities

G1645 or its equivalogs could be used to alter inflorescence structure, which may have value in production of novel ornamental plants.

G1645 or equivalog activity could be used to alter a plant's response to heat stress.

G1760 (SEQ ID NO: 937)

Published Information

G1760 was identified in the sequence of BAC clone F20D10 (gene AT4g37940/F20D10.60, GenBank accession number CAB80459). G1760 also corresponds to AGL21. A phylogenetic analysis of the Arabidopsis MADS box gene family situated G1760/AGL21 in the same clade as ANR1 and AGL17, which are root-specific (Alvarez-Buylla et al. (2000) Proc. Natl. Acad. Sci. USA. 97:5328-5333). No information is available about the function(s) of G1760/AGL21.

Closely Related Genes from Other Species

G1760 shows sequence similarity with a tomato gene represented in GenBank by an EST: AW219962 EST302445 tomato root during/after fruit set, Lycopersicon esculentum cDNA clone cLEX6M17. Since similarity between the two genes extends beyond the conserved MADS domain, and because the tomato gene represented by EST AW219962 is also expressed in roots, both genes might be related in function.

Experimental Observations

The function of G1760 was studied using transgenic plants in which this gene was expressed under the control of the 35S promoter. Overexpression of G1760 consistently reduced the time to flowering. G1760 overexpressing plants did not show any other morphological or physiological alterations in the assays that were performed. In fact, overexpression of G1760 was not observed to have deleterious effects: 35S::G1760 plants were healthy and attained a wild-type stature when mature. To date, a substantial number of genes have been found to promote flowering. Many, however, including those encoding the transcription factors APETALA1, LEAFY, and CONSTANS, produce extreme dwarfing and/or shoot termination when over-expressed.

G1760 was specifically expressed in roots, in good agreement with its position within the Arabidopsis MADS-box gene family phylogenetic tree (Alvarez-Buylla et al. (2000) supra). The expression of G1760 appeared to be ectopically induced by drought stress.

Utilities

G1760 could be used to modify a plant's flowering time characteristics. In addition, its promoter could be used to confer root specific gene expression, as well as to engineer responsiveness to drought.

In general, a wide variety of applications exist for systems that either lengthen or shorten the time to flowering.

G1816 (SEQ ID NO: 939)

Published Information

G1816 is a member of the MYB-related class of transcription factors. The gene corresponds to TRIPTYCHON (TRY), and has recently been shown to be involved in lateral inhibition during epidermal cell specification in the leaf and root (Schellmann et al. (2002) EMBO J. 21: 5036-5046). The model proposes that TRY (G1816) and CPC (G225) function as repressors of trichome and atrichoblast cell fate. TRY loss-of-function mutants form ectopic trichomes on the leaf surface. TRY gain-of-function mutants are glabrous and form ectopic root hairs.

Experimental Observations

The complete sequence of G1816 was determined. The function of the gene was studied using transgenic plants in which G1816 was expressed under the control of the 35S promoter. Consistent with the morphological phenotypes published for the 35S::TRY overexpressors, the transgenic plants were glabrous and formed ectopic root hairs. These transgenic lines were also more tolerant to growth under N limiting conditions, both in a germination assay as well as a root growth assay on older seedlings. The N germination assay looked for alterations in C:N sensing through the increase or decrease in anthocyanin production of the transgenics relative to the controls. 35S::G1816 transgenic lines produced less anthocyanin than wild-type controls. However, in addition to the N tolerance phenotypes observed in these transgenic lines, the 35S::G1816 plants were also insensitive to the growth retardation effects of germination under conditions of high glucose, suggesting that this gene could play a role in sugar sensing responses in the plant or osmotic stress tolerance. Genes for many sugar sensing mutants are allelic to genes involved in ABA and ethylene signaling (Rolland et al. (2002) Plant Cell 14 Suppl: S185-S205). Therefore, G1816 could also be involved in hormone signaling pathways.

Utilities

The phenotypic effects of G1816 overexpression, such as the increase in root hair formation and the increase in seedling vigor observed in a germination assay on high glucose media, indicated that the gene could be used to engineer plants with increased tolerance to abiotic stresses such as drought, salt, heat or cold.

In addition, the enhanced performance of G1816 overexpression lines under low nitrogen conditions indicated that the gene could be used to engineer crops that could thrive under conditions of reduced nitrogen availability.

The effect of G1816 overexpression on insensitivity to glucose in a germination assay indicated that the gene could be involved in sugar sensing responses in the plant. The potential utilities of a gene involved in glucose-specific sugar sensing are to alter energy balance, photosynthetic rate, carbohydrate accumulation, biomass production, source-sink relationships, and senescence.

G1816 could also be used to alter anthocyanin production and trichome formation in leaves.

G1820 (SEQ ID NO: 941)

Published Information

G1820 is a member of the Hap5 subfamily of CCAAT-box-binding transcription factors. G1820 was identified as part of the BAC clone MBA10, accession number AB025619, released by the Arabidopsis Genome sequencing project.

Closely Related Genes from Other Species

G1820 is closely related to a soybean gene represented by EST335784 isolated from leaves infected with Colletotrichum trifolii. Similarity between G1820 and the soybean gene extends beyond the signature motif of the family to a level that would suggest the genes are orthologous. Therefore the gene represented by EST335784 may have a function and/or utility similar to that of G1820.

Experimental Observations

The complete sequence of G1820 was determined. The function of this gene was analyzed using transgenic plants in which G1820 was expressed under the control of the 35S promoter. G1820 overexpressing lines showed more tolerance to salt stress in a germination assay. They also showed insensitivity to ABA, with the three lines analyzed showing the phenotype. The salt and ABA phenotypes could be related to the plants' increased tolerance to osmotic stress because in a severe water deprivation assay, G1820 overexpressors were, again, more tolerant.

Interestingly, overexpression of G1820 also consistently reduced the time to flowering. Under continuous light conditions at 20-25° C, the 35S::G1820 transformants displayed visible flower buds several days earlier than control plants. The primary shoots of these plants typically started flower initiation 1-4 leaf plastochrons sooner than those of wild type. Such effects were observed in all three T2 populations and in a substantial number of primary transformants.

When biochemical assays were performed, some changes in leaf fatty acid methyl esters (FAMEs) were detected. In one line, an increase in the percentage of 18:3 and a decrease in 16:1 were observed. Otherwise, G1820 overexpressors behaved similarly to wild-type controls in all biochemical assays performed. As determined by RT-PCR, G1820 was highly expressed in embryos and siliques. No expression of G1820 was detected in the other tissues tested. G1820 expression appeared to be induced in rosette leaves by cold and drought stress treatments, and overexpressing lines showed tolerance to water deficit and high salt conditions.

One possible explanation for the complexity of the G1820 overexpression phenotype is that the gene is somehow involved in the cross talk between ABA and GA signal transduction pathways. It is well known that seed dormancy and germination are regulated by the plant hormones abscisic acid (ABA) and gibberellin (GA). These two hormones act antagonistically with each other. ABA induces seed dormancy in maturing embryos and inhibits germination of seeds. GA breaks seed dormancy and promotes germination. It is conceivable that the flowering time and ABA insensitive phenotypes observed in the G1820 overexpressors are related to an enhanced sensitivity to GA, or an increase in the level of GA, and that the phenotype of the overexpressors is unrelated to ABA. In Arabidopsis, GA is thought to be required to promote flowering in non-inductive photoperiods. However, the drought and salt tolerant phenotypes would indicate that ABA signal transduction is also perturbed in these plants. It seems counterintuitive for a plant with salt and drought tolerance to be ABA insensitive since ABA seems to activate signal transduction pathways involved in tolerance to salt and dehydration stresses. One explanation is that ABA levels in the G1820 overexpressors are also high but that the plant is unable to perceive or transduce the signal.

G1820 overexpressors also had decreased seed oil content and increased seed protein content compared to wild-type plants.

Utilities

G1820 affects ABA sensitivity, and thus when transformed into a plant this transcription factor or its equivalogs may diminish cold, drought, oxidative and other stress sensitivities, and also be used to alter plant architecture and yield.

The osmotic stress results indicate that G1820 or its equivalogs could be used to alter a plant's response to water deficit conditions and can be used to engineer plants with enhanced tolerance to drought, salt stress, and freezing. Evaporation from the soil surface causes upward water movement and salt accumulation in the upper soil layer where the seeds are placed. Thus, germination normally takes place at a salt concentration much higher than the mean salt concentration of the whole soil profile. Increased salt tolerance during the germination stage of a crop plant would impact survivability and yield.

G1820 or its equivalogs could also be used to accelerate flowering time.

G1820 or its equivalogs may be used to modify levels of saturation in oils.

G1820 or its equivalogs may be used to alter seed protein content, which may be very important for the nutritional value and production of various food products.

The promoter of G1820 could be used to drive seed-specific gene expression.

G1842 (SEQ ID NO: 943)

Published Information

G1842 corresponds to F1505.2 (BAA97510). The high level of sequence similarity between G1842 and FLOWERING LOCUS C (Michaels and Amasino, 1999; Sheldon et al., 1999) has been previously described (Ratcliffe et al. (2001) Plant Physiol. 126:122-132).

Experimental Observations

G1842 was recognized as a gene highly related to Arabidopsis FLOWERING LOCUS C (FLC; Michaels et al. (1999) Plant Cell 11, 949-956; Sheldon et al. (1999) Plant Cell 11, 445-458), and to MADS AFFECTING FLOWERING 1 (Ratcliffe et al. (2001) Plant Physiol. 126:122-132). FLC acts as a repressor of flowering (Michaels et al. (1999) Plant Cell 11, 949-956; Sheldon et al. (1999) Plant Cell 11, 445-458). Similarly, G157/MAF1 can cause a delay in flowering time when overexpressed (Ratcliffe et al. (2001) Plant Physiol. 126:122-132).

The function of G1842 was studied using transgenic plants in which this gene was expressed under the control of the 35S promoter. Overexpression of G1842 reduced the time to flowering in the Columbia background. No consistent alterations were detected in 35S::G1842 plants in the physiological and biochemical analyses that were performed.

Early flowering was observed in 13/21 35S::G1842 primary transformants: under continuous light conditions, these plants produced flower buds approximately 1 week earlier than controls. A comparable phenotype was also noted in the T2 populations from each of the three lines examined. In a separate experiment, the 35S::G1842 transgene was transformed into Stockholm (a late flowering, vernalization-sensitive ecotype). A result comparable to that seen for Columbia was observed: approximately 50% of 35S::G1842 Stockholm plants flowered earlier than wild-type controls.

Although G1842 is highly related in sequence to G157, G859, and FLC, its overexpression reduced the time to flowering, whereas overexpression of G157, G859, and FLC often caused a delay in flowering. In other words, whereas the function of G157, G859, and FLC appeared to be to repress flowering, G1842 was an activator of that process.

Utilities

G1842 or its equivalogs could be used to alter flowering time.

G1843 (SEQ ID NO: 945)

Published Information

G1843 corresponds to F1505.3 (BAA97511). There is no literature published on G1843, except our own (Ratcliffe et al. (2001) Plant Physiol. 126:122-132). G1843 belongs to a group of five Arabidopsis MADS-box genes that are highly related to FLC (G1759), a repressor of the floral transition, and that we have called MADS AFFECTING FLOWERING 1-5 (Ratcliffe et al. (2001) Plant Physiol. 126:122-132). The published report describes functional data for only MAF1 (G157), but the sequence similarity among all the members of the group is noted.

Experimental Observations

The function of G1843 was studied using transgenic plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G1843 caused alterations in plant growth and development, in particular a severe reduction in overall plant size, premature senescence, and early flowering. That G1843 caused an effect on flowering time was expected because of its sequence similarity to G1759 (FLC), G157 (MAF1), G859, G1842, and G1844. However, in contrast to all these other genes, which when overexpressed can alter flowering time (either delay or accelerate, depending on the gene) without severe side effects on the plant, overexpression of G1843 was severely detrimental.

Primary transformants for 35S::G1843 were consistently small, showed stunted growth, and formed poorly developed inflorescences that yielded relatively few seeds. The most severely affected of these plants were very small, and died at early stages of development. Approximately 50% of the 35S::G1843 transformants were also markedly early flowering and displayed visible flower buds 1-7 days earlier than any of the wild-type controls. Most notably, the leaves of 35S::G1843 transformants frequently senesced prematurely. A total of six T2 lines were morphologically examined; all exhibited (to varying extents) comparable phenotypes to those observed in the T1 generation, showing premature senescence and stunted growth. Due to these deleterious effects, however, an accurate determination of flowering time was difficult to make in the T2 generation.

The deleterious effects caused by G1843 overexpression were also noted in the physiological analyses that were performed: in general, the G1843 overexpressing lines showed reduced seedling vigor and were pale compared to wild-type controls. 35S::G1843 plants behaved otherwise like wild-type controls in the physiological assays.

No alterations were detected in 35S::G1843 plants in the biochemical analyses that were performed.

G1843 was ubiquitously expressed and did not appear to be significantly induced by any of the conditions tested.

G1947 (SEQ ID NO: 949)

Published Information

The heat shock transcription factor G1947 is a member of the class-A HSFs (Nover et al. (1996) Cell Stress Chaperones 1: 215-223) characterized by an extended HR-A/B oligomerization domain. G1947 is found in the sequence of the chromosome 5 P1 clone MQD19 (GenBank accession AB026651.1 GI:4757407), released by the Arabidopsis Genome Initiative. The start codon was incorrectly predicted in the public annotation.

Experimental Observations

Analysis of the endogenous level of G1947 transcripts by RT-PCR revealed constitutive expression, with the highest expression levels in rosette leaves and the lowest in shoots and roots. G1947 expression appeared to be induced by a variety of physiological or environmental conditions (auxin, ABA, heat, drought and osmotic stress).

A line homozygous for a T-DNA insertion in G1947 was used to analyze the function of this gene. The insertion point is 163 nucleotides downstream from the initiation codon of G1947, and therefore should result in a null mutation.

G1947 mutant plants formed inflorescences that grew for an extended period of time, and continued to generate flowers for substantially longer than wild-type controls. In G1947 mutant plants, silique development was generally poor: the siliques were very short and contained only a few irregularly shaped seeds. Thus, the extended phase of flower production observed in G1947 knockout mutant plants might have been the result of poor fertility, because extended production of flowers and delayed floral organ abscission is often seen in sterile Arabidopsis mutants. The basis for the reduced fertility of G1947 knockout plants was not apparent from the morphology of their flowers. In addition, some inconsistent effects on seedling size were noted for G1947 knockout mutants. No size differences were noted between rosette stage G1947 knockout plants and controls, although at late stages the G1947 knockout plants appeared bushier than controls, probably due to continued growth of the inflorescence stems.

No altered phenotypes were observed for G1947 knockout plants in any of the physiological or biochemical assays performed.

Utilities

G1947 or its equivalogs could be used to engineer infertility in transgenic plants. G1947 may also have utility in engineering plants with longer-lasting flowers for the horticulture industry.

G2010 (SEQ ID NO: 951)

Published Information

G2010 is a member of the SBP family of transcription factors and corresponds to spl4 (Cardon et al., 1999). Expression of spl4 is up-regulated during development under both long day and short day conditions and is highly expressed in the inflorescence tissue. Expression of G2010 is localized to the rib meristem and inter-primordial regions of the inflorescence apex (Cardon et al. (1999) Gene 237:91-104).

Closely Related Genes from Other Species

A gene related to G2010 is squamosa-promoter binding protein 1 from Antirrhinum majus.

Experimental Observations

The complete sequence of G2010 was determined. The function of this gene was analyzed using transgenic plants in which G2010 was expressed under the control of the 35S promoter. Overexpression of G2010 resulted in a clear reduction in time to flowering. Under continuous light conditions, at 20-25° C, three independent T2 lines of 35S::G2010 plants flowered approximately one week earlier than wild-type controls. The primary shoot of 35S::G2010 plants switched to reproductive growth after producing 5-6 rosette leaves, compared with 8-10 rosette leaves in controls. Flower buds were first visible 12-14 days after sowing in 35S::G2010 plants compared with approximately 20 days for wild type. 35S::G2010 transformants were also observed to begin senescence sooner than controls. Otherwise, plants overexpressing G2010 were wild-type in phenotype.

Expression of G2010 was not detected by RT-PCR in any of the tissues tested. G2010 was slightly induced in rosette leaves in response to heat and cold stress treatments as well as salicylic acid treatment. The expression profile for G2010 indicated that this gene is involved in a plant's transition to flowering, both normally and in response to stressful environmental conditions.

Utilities

The potential utility of a gene such as G2010 or its equivalogs is to accelerate flowering time.

G2347 (SEQ ID NO: 957)

Published Information

G2347 is a member of the SBP family of transcription factors and corresponds to SPL5 (Cardon et al., 1999). Expression of SPL5 is up-regulated in seedlings during development under both long day and short day conditions, and the gene is highly expressed in the inflorescence tissue. Expression of G2347 is specifically localized in the inflorescence apical meristem and young flowers (Cardon et al. (1999) Gene 237: 91-104).

Closely Related Genes from Other Species

The closest relative to G2347 is the Antirrhinum protein, SBP2 (CAA63061). The similarity between these two proteins is extensive enough to suggest they might have similar functions in a plant.

Experimental Observations

G2347 was analyzed using transgenic plants in which G2347 was expressed under the control of the 35S promoter. Overexpression of G2347 markedly reduced the time to flowering in Arabidopsis. This phenotype was apparent in the majority of primary transformants and in all plants from two out of the three T2 lines examined. Under continuous light conditions, 35S::G2347 plants formed flower buds up to a week earlier than wild type. Many of the plants were rather small and spindly compared to controls. To demonstrate that overexpression of G2347 could induce flowering under less inductive photoperiods, two T2 lines were re-grown in 12 hour conditions; again, all plants from both lines bolted early, with some initiating flower buds up to two weeks sooner than wild type. As determined by RT-PCR, G2347 was highly expressed in rosette leaves and flowers, and at much lower levels in embryos and siliques. No expression of G2347 was detected in the other tissues tested. G2347 expression was repressed by cold, by auxin treatments, and by infection by Erysiphe. G2347 is also highly similar to the Arabidopsis protein G2010. The level of homology between these two proteins suggested they could have similar, overlapping, or redundant functions in Arabidopsis. In support of this hypothesis, overexpression of both G2010 and G2347 resulted in early flowering phenotypes in transgenic plants.

Utilities

G2347 or its equivalogs may be used to modify the time to flowering in plants.

G2718 (SEQ ID NO: 959)

Published Information

G2718 (AT1G01380) was identified in the BAC clone, F6F3 (GenBank accession AC023628). No published information on the function(s) of G2718 is available; however, two highly related genes, TRY and CPC, have been implicated in epidermal cell specification. The model proposes that TRY (G1816) and CPC (G225) function as repressors of trichome and atrichoblast cell fate (Schellmann et al. (2002) EMBO J. 21: 5036-5046).

Experimental Observations

The function of G2718 was studied using plants in which the gene was expressed under the control of the 35S promoter. Overexpression of G2718 resulted in a glabrous phenotype. The effect was highly penetrant, being observed in 17/17 primary transformants and each of three independent T2 lines. All of the T1 lines showed a very strong phenotype and completely lacked trichomes on leaves and stems. A comparably severe effect was observed in one of the three T2 populations, whereas the other two T2 populations each exhibited a weaker phenotype, suggesting that the transgene might have become partially silenced between the generations. Trichomes were present in these weaker lines, but at a much lower density than in wild type.

In addition to the effects on trichome density, 35S::G2718 transformants were also generally slightly smaller than wild type controls.

The above phenotypic effects of G2718 overexpression are consistent with the observed effects for all of the members of the G2718 clade (G225, G226, G1816, and G682). In addition to the morphological effects, the physiological effects are quite similar to the other members of the clade as well. However, due to the apparent silencing of the transgene in the T2 generation, only one of the lines displayed a strong response. This line was clearly glabrous, exhibited ectopic root hairs and was more tolerant to growth under conditions of limited nitrogen.

Utilities

The phenotypic effects of G2718 overexpression, such as the increase in root hair formation and the increase in seedling vigor observed in a root growth assay on N-limiting media, indicated that the gene could be used to engineer plants with increased tolerance to abiotic stresses such as nutrient limitation, drought, salt, heat or cold.

The enhanced performance of G2718 overexpression lines under low nitrogen conditions indicated that the gene could be used to engineer crops that could thrive under conditions of reduced nitrogen availability.

G2718 could also be used to alter anthocyanin production and trichome formation in leaves.

Example IX Identification of Homologous Sequences

This example describes identification of genes that are orthologous to Arabidopsis thaliana transcription factors from a computer homology search.

Homologous sequences, including those of paralogs and orthologs from Arabidopsis and other plant species, were identified using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215: 403-410; and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402). The tblastx sequence analysis programs were employed using the BLOSUM-62 scoring matrix (Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. 89: 10915-10919). The entire NCBI GenBank database was filtered for sequences from all plants except Arabidopsis thaliana by selecting all entries in the NCBI GenBank database associated with NCBI taxonomic ID 33090 (Viridiplantae; all plants) and excluding entries associated with taxonomic ID 3701 (Arabidopsis thaliana).

These sequences were compared to sequences representing transcription factor genes presented in the Sequence Listing, using the Washington University TBLASTX algorithm (version 2.0a19 MP) at the default settings using gapped alignments with the filter “off”. For each gene of SEQ ID NO: 2N-1, wherein N=1-480, individual comparisons were ordered by probability score (P-value), where the score reflects the probability that a particular alignment occurred by chance. For example, a score of 3.6e-40 is 3.6×10⁻⁴⁰. In addition to P-values, comparisons were also scored by percentage identity. Percentage identity reflects the degree to which two segments of DNA or protein are identical over a particular length. Examples of sequences so identified are presented in Table 7 and Table 9. Paralogous or orthologous sequences were readily identified and are available in GenBank by Accession number (Table 7; Test sequence ID). The percent sequence identity among these sequences can be as low as 47%, or even lower.
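The ordering and scoring described above can be sketched as follows. This is a minimal illustration only, not the search software used in this example: the Hit record, the identifiers "hypo_1" to "hypo_3", and the short aligned segments are hypothetical, and percentage identity is assumed here to be the number of identical aligned positions divided by the alignment length, with gap columns counted as non-identical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    accession: str    # hypothetical identifier, not a real GenBank entry
    p_value: float    # probability that the alignment arose by chance
    query_seg: str    # aligned query segment ('-' marks a gap)
    subject_seg: str  # aligned subject segment, same length as query_seg


def percent_identity(seg_a: str, seg_b: str) -> float:
    """Identical aligned positions as a percentage of the alignment length."""
    if len(seg_a) != len(seg_b) or not seg_a:
        raise ValueError("aligned segments must be non-empty and equal in length")
    matches = sum(1 for a, b in zip(seg_a, seg_b) if a == b and a != "-")
    return 100.0 * matches / len(seg_a)


def rank_hits(hits):
    """Order comparisons by P-value, most significant (smallest) first."""
    return sorted(hits, key=lambda h: h.p_value)


# Hypothetical comparisons for one query sequence (assumed values).
EXAMPLE_HITS = [
    Hit("hypo_1", 2e-12, "MAEGMLLP", "MAEGMILP"),
    Hit("hypo_2", 3.6e-40, "MAEGMLLP", "MAEGMLLP"),
    Hit("hypo_3", 1e-5, "MA-GMLLP", "MAEGMKKP"),
]

if __name__ == "__main__":
    for hit in rank_hits(EXAMPLE_HITS):
        identity = percent_identity(hit.query_seg, hit.subject_seg)
        print(f"{hit.accession}\tP={hit.p_value:.1e}\tidentity={identity:.1f}%")
```

Sorting on the raw floating-point P-value keeps a score written as 3.6e-40 ahead of one written as 2e-12 without any string parsing.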

Candidate paralogous sequences were identified among Arabidopsis transcription factors through alignment, identity, and phylogenetic relationships. A list of paralogs is shown in Table 9. Candidate orthologous sequences were identified from proprietary unigene sets of plant gene sequences in Zea mays, Glycine max and Oryza sativa based on significant homology to Arabidopsis transcription factors. These candidates were reciprocally compared to the set of Arabidopsis transcription factors. If the candidate showed maximal similarity in the protein domain to the eliciting transcription factor or to a paralog of the eliciting transcription factor, then it was considered to be an ortholog. Identified non-Arabidopsis sequences that were shown in this manner to be orthologous to the Arabidopsis sequences are provided in Table 7.
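The reciprocal comparison rule in the preceding paragraph can be expressed as a short sketch. The names TF_A, TF_B and TF_C, the similarity values, and the paralog map are hypothetical placeholders; how domain similarity is actually scored is not specified here, so the sketch simply assumes one numeric score per Arabidopsis transcription factor, with higher meaning more similar.

```python
def best_arabidopsis_match(domain_scores):
    """Return the Arabidopsis transcription factor with the highest similarity score."""
    if not domain_scores:
        raise ValueError("no reciprocal comparisons available")
    return max(domain_scores, key=domain_scores.get)


def is_candidate_ortholog(domain_scores, eliciting_tf, paralogs):
    """Apply the reciprocal test described in the text.

    domain_scores: similarity of the candidate to each Arabidopsis
        transcription factor in the protein domain (assumed numeric scale).
    eliciting_tf: the Arabidopsis factor that first retrieved the candidate.
    paralogs: map from a factor to the set of its Arabidopsis paralogs.
    """
    best = best_arabidopsis_match(domain_scores)
    return best == eliciting_tf or best in paralogs.get(eliciting_tf, set())


if __name__ == "__main__":
    # Hypothetical scores for one non-Arabidopsis candidate sequence.
    scores = {"TF_A": 0.62, "TF_B": 0.81, "TF_C": 0.40}
    paralog_map = {"TF_A": {"TF_B"}}
    # The best match is TF_B, a paralog of the eliciting factor TF_A,
    # so the candidate passes the test.
    print(is_candidate_ortholog(scores, "TF_A", paralog_map))
```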

Example X Screen of Plant cDNA Library for Sequence Encoding a Transcription Factor DNA Binding Domain That Binds To a Transcription Factor Binding Promoter Element and Demonstration of Protein Transcription Regulation Activity

The “one-hybrid” strategy (Li and Herskowitz (1993) Science 262: 1870-1874) is used to screen for plant cDNA clones encoding a polypeptide comprising a transcription factor DNA binding domain, a conserved domain. In brief, yeast strains are constructed that contain a lacZ reporter gene with either wild-type or mutant transcription factor binding promoter element sequences in place of the normal UAS (upstream activator sequence) of the GAL1 promoter. Yeast reporter strains are constructed that carry transcription factor binding promoter element sequences as UAS elements operably linked upstream (5′) of a lacZ reporter gene with a minimal GAL1 promoter. The strains are transformed with a plant expression library that contains random cDNA inserts fused to the GAL4 activation domain (GAL4-ACT) and screened for blue colony formation on X-gal-treated filters (X-gal: 5-bromo-4-chloro-3-indolyl-β-D-galactoside; Invitrogen Corporation, Carlsbad Calif.). Alternatively, the strains are transformed with a cDNA polynucleotide encoding a known transcription factor DNA binding domain polypeptide sequence.

Yeast strains carrying these reporter constructs produce low levels of beta-galactosidase and form white colonies on filters containing X-gal. The reporter strains carrying wild-type transcription factor binding promoter element sequences are transformed with a polynucleotide that encodes a polypeptide comprising a plant transcription factor DNA binding domain operably linked to the acidic activator domain of the yeast GAL4 transcription factor, “GAL4-ACT”. The clones that contain a polynucleotide encoding a transcription factor DNA binding domain operably linked to GAL4-ACT can bind upstream of the lacZ reporter genes carrying the wild-type transcription factor binding promoter element sequence, activate transcription of the lacZ gene and result in yeast forming blue colonies on X-gal-treated filters.

Upon screening about 2×10⁶ yeast transformants, positive cDNA clones are isolated; i.e., clones that cause yeast strains carrying lacZ reporters operably linked to wild-type transcription factor binding promoter elements to form blue colonies on X-gal-treated filters. The cDNA clones do not cause a yeast strain carrying mutant transcription factor binding promoter elements fused to lacZ to turn blue. Thus, a polynucleotide encoding a transcription factor DNA binding domain, a conserved domain, is shown to activate transcription of a gene.

Example XI Gel Shift Assays

The presence of a transcription factor comprising a DNA binding domain which binds to a DNA transcription factor binding element is evaluated using the following gel shift assay. The transcription factor is recombinantly expressed and isolated from E. coli or isolated from plant material. Total soluble protein, including transcription factor, (40 ng) is incubated at room temperature in 10 μl of 1× binding buffer (15 mM HEPES (pH 7.9), 1 mM EDTA, 30 mM KCl, 5% glycerol, 5% bovine serum albumin, 1 mM DTT) plus 50 ng poly(dI-dC):poly(dI-dC) (Pharmacia, Piscataway N.J.) with or without 100 ng competitor DNA. After 10 minutes incubation, probe DNA comprising a DNA transcription factor binding element (1 ng) that has been ³²P-labeled by end-filling (Sambrook et al. (1989) supra) is added and the mixture incubated for an additional 10 minutes. Samples are loaded onto polyacrylamide gels (4% w/v) and fractionated by electrophoresis at 150 V for 2 h (Sambrook et al. supra). The degree of transcription factor-probe DNA binding is visualized using autoradiography. Probes and competitor DNAs are prepared from oligonucleotide inserts ligated into the BamHI site of pUC118 (Vieira et al. (1987) Methods Enzymol. 153: 3-11). Orientation and concatenation number of the inserts are determined by dideoxy DNA sequence analysis (Sambrook et al. supra). Inserts are recovered after restriction digestion with EcoRI and HindIII and fractionation on polyacrylamide gels (12% w/v) (Sambrook et al. supra).

Example XII Introduction of Polynucleotides into Dicotyledonous Plants

Transcription factor sequences listed in the Sequence Listing recombined into pMEN20 or pMEN65 expression vectors are transformed into a plant for the purpose of modifying plant traits. The cloning vector may be introduced into a variety of dicotyledonous plants by means well known in the art such as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated transformation. It is now routine to produce transgenic plants using most dicot plants (see Weissbach and Weissbach, (1989) supra; Gelvin et al. (1990) supra; Herrera-Estrella et al. (1983) supra; Bevan (1984) supra; and Klee (1985) supra). Methods for analysis of traits are routine in the art and examples are disclosed above.

Example XIII Transformation of Cereal Plants with an Expression Vector

Cereal plants such as, but not limited to, corn, wheat, rice, sorghum, or barley, may also be transformed with the present polynucleotide sequences in pMEN20 or pMEN65 expression vectors for the purpose of modifying plant traits. For example, pMEN20 may be modified to replace the NptII coding region with the BAR gene of Streptomyces hygroscopicus that confers resistance to phosphinothricin. The KpnI and BglII sites of the BAR gene are removed by site-directed mutagenesis with silent codon changes.

The cloning vector may be introduced into a variety of cereal plants by means well known in the art such as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated transformation. It is now routine to produce transgenic plants of most cereal crops (Vasil (1994) Plant Mol. Biol. 25: 925-937) such as corn, wheat, rice, sorghum (Casas et al. (1993) Proc. Natl. Acad. Sci. 90: 11212-11216), and barley (Wan and Lemaux (1994) Plant Physiol. 104:37-48). DNA transfer methods such as microprojectile bombardment can be used for corn (Fromm et al. (1990) Bio/Technol. 8: 833-839; Gordon-Kamm et al. (1990) Plant Cell 2: 603-618; Ishida (1990) Nature Biotechnol. 14:745-750), wheat (Vasil et al. (1992) Bio/Technol. 10:667-674; Vasil et al. (1993) Bio/Technol. 11:1553-1558; Weeks et al. (1993) Plant Physiol. 102:1077-1084), and rice (Christou (1991) Bio/Technol. 9:957-962; Hiei et al. (1994) Plant J. 6:271-282; Aldemita and Hodges (1996) Planta 199:612-617; and Hiei et al. (1997) Plant Mol. Biol. 35:205-218). For most cereal plants, embryogenic cells derived from immature scutellum tissues are the preferred cellular targets for transformation (Hiei et al. (1997) Plant Mol. Biol. 35:205-218; Vasil (1994) Plant Mol. Biol. 25: 925-937).

Vectors according to the present invention may be transformed into corn embryogenic cells derived from immature scutellar tissue by using microprojectile bombardment, with the A188XB73 genotype as the preferred genotype (Fromm et al. (1990) Bio/Technol. 8: 833-839; Gordon-Kamm et al. (1990) Plant Cell 2: 603-618). After microprojectile bombardment the tissues are selected on phosphinothricin to identify the transgenic embryogenic cells (Gordon-Kamm et al. (1990) Plant Cell 2: 603-618). Transgenic plants are regenerated by standard corn regeneration techniques (Fromm et al. (1990) Bio/Technol. 8: 833-839; Gordon-Kamm et al. (1990) Plant Cell 2: 603-618).

The plasmids prepared as described above can also be used to produce transgenic wheat and rice plants (Christou (1991) Bio/Technol. 9:957-962; Hiei et al. (1994) Plant J. 6:271-282; Aldemita and Hodges (1996) Planta 199:612-617; and Hiei et al. (1997) Plant Mol. Biol. 35:205-218) that coordinately express genes of interest by following standard transformation protocols known to those skilled in the art for rice and wheat (Vasil et al. (1992) Bio/Technol. 10:667-674; Vasil et al. (1993) Bio/Technol. 11:1553-1558; and Weeks et al. (1993) Plant Physiol. 102:1077-1084), where the bar gene is used as the selectable marker.

Example XIV Identification of Orthologous and Paralogous Sequences

Orthologs to Arabidopsis genes may be identified by several methods, including hybridization, amplification, or bioinformatically. This example describes how one may identify equivalogs to the Arabidopsis AP2 family transcription factor CBF1 (polynucleotide SEQ ID NO: 43, encoded polypeptide SEQ ID NO: 44), which confers tolerance to abiotic stresses (Thomashow et al. (2002) U.S. Pat. No. 6,417,428), and an example to confirm the function of homologous sequences. In this example, orthologs to CBF1 were found in canola (Brassica napus) using polymerase chain reaction (PCR).

Degenerate primers were designed for regions of the AP2 binding domain and outside of the AP2 domain (carboxyl terminal domain):

Mol 368 (reverse) 5′-CAY CCN ATH TAY MGN GGN GT-3′ (SEQ ID NO: 1942)
Mol 378 (forward) 5′-GGN ARN ARC ATN CCY TCN GCC-3′ (SEQ ID NO: 1943)
(Y: C/T, N: A/C/G/T, H: A/C/T, M: A/C, R: A/G)

Primer Mol 368 is in the AP2 binding domain of CBF1 (amino acid sequence: His-Pro-Ile-Tyr-Arg-Gly-Val) while primer Mol 378 is outside the AP2 domain (carboxyl terminal domain) (amino acid sequence: Met-Ala-Glu-Gly-Met-Leu-Leu-Pro).
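The degenerate notation above can be checked with a short sketch. It counts how many distinct oligonucleotides each primer pool represents and confirms that each complete degenerate codon of Mol 368 can encode the corresponding residue of His-Pro-Ile-Tyr-Arg-Gly. The function names are illustrative, and the standard genetic code is assumed; only the ambiguity symbols defined in the key above are included.

```python
from itertools import product

# Ambiguity symbols as defined in the primer key above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "M": "AC", "H": "ACT", "N": "ACGT",
}

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

MOL_368 = "CAYCCNATHTAYMGNGGNGT"   # SEQ ID NO: 1942, spaces removed
MOL_378 = "GGNARNARCATNCCYTCNGCC"  # SEQ ID NO: 1943, spaces removed


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for symbol in primer:
        count *= len(IUPAC[symbol])
    return count


def codons(primer: str):
    """Split a primer into its complete codons, ignoring any trailing bases."""
    usable = len(primer) - len(primer) % 3
    return [primer[i:i + 3] for i in range(0, usable, 3)]


def encoded_residues(degenerate_codon: str):
    """Set of amino acids (one-letter codes) a degenerate codon can encode."""
    options = (IUPAC[symbol] for symbol in degenerate_codon)
    return {CODON_TABLE["".join(c)] for c in product(*options)}


if __name__ == "__main__":
    print("Mol 368 pool size:", degeneracy(MOL_368))
    print("Mol 378 pool size:", degeneracy(MOL_378))
    for residue, codon in zip("HPIYRG", codons(MOL_368)):
        print(codon, sorted(encoded_residues(codon)), "covers", residue)
```

Note that a degenerate codon such as MGN covers every arginine codon of the CGN and AGR types but also admits serine codons (AGY), which is a usual trade-off when one symbol set must span a six-codon amino acid.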

The genomic DNA isolated from B. napus was PCR-amplified using these primers under the following conditions: an initial denaturation step of 2 min at 93° C.; 35 cycles of 93° C. for 1 min, 55° C. for 1 min, and 72° C. for 1 min; and a final incubation of 7 min at 72° C. at the end of cycling.
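For reference, the cycling conditions can be written down as a small data structure, which also gives the total programmed hold time. This is an illustrative sketch only: the Step type and constant names are hypothetical, the extension temperature is taken to be 72° C., and ramp times between temperatures are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    minutes: int  # programmed hold time


INITIAL_DENATURATION = Step(93, 2)
CYCLE = (Step(93, 1), Step(55, 1), Step(72, 1))  # denature, anneal, extend
CYCLE_COUNT = 35
FINAL_EXTENSION = Step(72, 7)


def total_hold_minutes() -> int:
    """Sum of programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return (INITIAL_DENATURATION.minutes
            + CYCLE_COUNT * per_cycle
            + FINAL_EXTENSION.minutes)


if __name__ == "__main__":
    print(f"Programmed hold time: {total_hold_minutes()} min")
```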

The PCR products were separated by electrophoresis on a 1.2% agarose gel, transferred to nylon membrane, and hybridized with the AtCBF1 probe prepared from Arabidopsis genomic DNA by PCR amplification. The hybridized products were visualized by a colorimetric detection system (Boehringer Mannheim) and the corresponding bands from a similar agarose gel were isolated using the Qiagen Extraction Kit (Qiagen, Valencia Calif.). The DNA fragments were ligated into the TA clone vector from the TOPO TA Cloning Kit (Invitrogen Corporation, Carlsbad Calif.) and transformed into E. coli strain TOP10 (Invitrogen).

Seven colonies were picked and, after plasmid DNA isolation, the inserts were sequenced on both the sense and antisense strands on an ABI 377 machine. The DNA sequence was edited by sequencer and aligned with the AtCBF1 sequence using GCG software and NCBI BLAST searching.

The nucleic acid sequence and amino acid sequence of one canola ortholog identified by this process (bnCBF1; polynucleotide SEQ ID NO: 1940 and polypeptide SEQ ID NO: 1941) are shown in the Sequence Listing.

The aligned amino acid sequences show that the bnCBF1 gene has 88% identity with the Arabidopsis sequence in the AP2 domain region and 85% identity with the Arabidopsis sequence outside the AP2 domain when aligned for two insertion sequences that are outside the AP2 domain.

Similarly, paralogous sequences to Arabidopsis genes, such as CBF1, may also be identified.

Two paralogs of CBF1 from Arabidopsis thaliana, CBF2 and CBF3, have been cloned and sequenced as described below. The DNA sequences (SEQ ID NOs: 45 and 47) and the encoded proteins (SEQ ID NOs: 46 and 48) are set forth in the Sequence Listing.

A lambda cDNA library prepared from RNA isolated from Arabidopsis thaliana ecotype Columbia (Lin and Thomashow (1992) Plant Physiol. 99: 519-525) was screened for recombinant clones that carried inserts related to the CBF1 gene (Stockinger et al. (1997) Proc. Natl. Acad. Sci. 94:1035-1040). CBF1 was ³²P-radiolabeled by random priming (Sambrook et al. supra) and used to screen the library by the plaque-lift technique using standard stringent hybridization and wash conditions (Hajela et al. (1990) Plant Physiol. 93:1246-1252; Sambrook et al. supra; 6×SSPE buffer, 60° C. for hybridization and 0.1× SSPE buffer and 60° C. for washes). Twelve positively hybridizing clones were obtained and the DNA sequences of the cDNA inserts were determined. The results indicated that the clones fell into three classes. One class carried inserts corresponding to CBF1. The two other classes carried sequences corresponding to two different homologs of CBF1, designated CBF2 and CBF3. The nucleic acid sequences and predicted protein coding sequences for Arabidopsis CBF1, CBF2 and CBF3 are listed in the Sequence Listing (SEQ ID NOs: 43, 45, 47 and SEQ ID NOs: 44, 46, 48, respectively). The nucleic acid sequence and predicted protein coding sequence for the Brassica napus CBF ortholog are listed in the Sequence Listing (SEQ ID NOs: 1940 and 1941, respectively).

A comparison of the nucleic acid sequences of Arabidopsis CBF1, CBF2 and CBF3 indicates that they are 83 to 85% identical, as shown in Table 13.

TABLE 13
Percent identity^(a)
             DNA^(b)    Polypeptide
cbf1/cbf2    85         86
cbf1/cbf3    83         84
cbf2/cbf3    84         85
^(a)Percent identity was determined using the Clustal algorithm from the Megalign program (DNASTAR, Inc.).
^(b)Comparisons of the nucleic acid sequences of the open reading frames are shown.

Similarly, the amino acid sequences of the three CBF polypeptides range from 84 to 86% identity. An alignment of the three amino acid sequences reveals that most of the differences in amino acid sequence occur in the acidic C-terminal half of the polypeptide. This region of CBF1 serves as an activation domain in both yeast and Arabidopsis (not shown).

Residues 47 to 106 of CBF1 correspond to the AP2 domain of the protein, a DNA binding motif that, to date, has only been found in plant proteins. A comparison of the AP2 domains of CBF1, CBF2 and CBF3 indicates that there are a few differences in amino acid sequence. These differences in amino acid sequence might have an effect on DNA binding specificity.

Example XV Transformation of Canola with a Plasmid Containing CBF1, CBF2, or CBF3

After identifying homologous genes to CBF1, canola was transformed with a plasmid containing the Arabidopsis CBF1, CBF2, or CBF3 genes cloned into the vector pGA643 (An (1987) Methods Enzymol. 153: 292). In these constructs the CBF genes were expressed constitutively under the CaMV 35S promoter. In addition, the CBF1 gene was cloned under the control of the Arabidopsis COR15 promoter in the same vector pGA643. Each construct was transformed into Agrobacterium strain GV3101. Transformed Agrobacteria were grown for 2 days in minimal AB medium containing appropriate antibiotics.

Spring canola (B. napus cv. Westar) was transformed using the protocol of Moloney et al. (1989) Plant Cell Reports 8: 238, with some modifications as described. Briefly, seeds were sterilized and plated on half strength MS medium, containing 1% sucrose. Plates were incubated at 24° C. under 60-80 μE/m²s light using a 16 hour light/8 hour dark photoperiod. Cotyledons from 4-5 day old seedlings were collected, the petioles cut and dipped into the Agrobacterium solution. The dipped cotyledons were placed on co-cultivation medium at a density of 20 cotyledons/plate and incubated as described above for 3 days. Explants were transferred to the same media, but containing 300 mg/l timentin (SmithKline Beecham, Pa.) and thinned to 10 cotyledons/plate. After 7 days explants were transferred to Selection/Regeneration medium. Transfers were continued every 2-3 weeks (2 or 3 times) until shoots had developed. Shoots were transferred to Shoot-Elongation medium every 2-3 weeks. Healthy looking shoots were transferred to rooting medium. Once good roots had developed, the plants were placed into moist potting soil.

The transformed plants were then analyzed for the presence of the NPTII gene/kanamycin resistance by ELISA, using the ELISA NPTII kit from 5Prime-3Prime Inc. (Boulder, Colo.). Approximately 70% of the screened plants were NPTII positive. Only those plants were further analyzed.

Northern blot analysis of the plants that were transformed with the constitutively expressing constructs showed expression of the CBF genes, and all CBF genes were capable of inducing the Brassica napus cold-regulated gene BN115 (homolog of the Arabidopsis COR15 gene). Most of the transgenic plants appear to exhibit a normal growth phenotype. As expected, the transgenic plants are more freezing tolerant than the wild-type plants. Using the electrolyte leakage of leaves test, the control showed a 50% leakage at −2 to −3° C. Spring canola transformed with either CBF1 or CBF2 showed a 50% leakage at −6 to −7° C. Spring canola transformed with CBF3 shows a 50% leakage at about −10 to −15° C. Winter canola transformed with CBF3 may show a 50% leakage at about −16 to −20° C. Furthermore, if the spring or winter canola are cold acclimated, the transformed plants may exhibit a further increase in freezing tolerance of at least 2° C.

To test salinity tolerance of the transformed plants, plants were watered with 150 mM NaCl. Plants overexpressing CBF1, CBF2, or CBF3 grew better compared with plants that had not been transformed with CBF1, CBF2, or CBF3.

These results demonstrate that equivalogs of Arabidopsis transcription factors can be identified and shown to confer similar functions in non-Arabidopsis plant species.

Example XVI Cloning of Transcription Factor Promoters

Promoters are isolated from transcription factor genes that have gene expression patterns useful for a range of applications, as determined by methods well known in the art (including transcript profile analysis with cDNA or oligonucleotide microarrays, Northern blot analysis, semi-quantitative or quantitative RT-PCR). Interesting gene expression profiles are revealed by determining transcript abundance for a selected transcription factor gene after exposure of plants to a range of different experimental conditions, and in a range of different tissue or organ types, or developmental stages. Experimental conditions to which plants are exposed for this purpose include cold, heat, drought, osmotic challenge, varied hormone concentrations (ABA, GA, auxin, cytokinin, salicylic acid, brassinosteroid), and pathogen and pest challenge. The tissue types and developmental stages include stem, root, flower, rosette leaves, cauline leaves, siliques, germinating seed, and meristematic tissue. The set of expression levels provides a pattern that is determined by the regulatory elements of the gene promoter.

Transcription factor promoters for the genes disclosed herein are obtained by cloning 1.5 kb to 2.0 kb of genomic sequence immediately upstream of the translation start codon for the coding sequence of the encoded transcription factor protein. This region includes the 5′-UTR of the transcription factor gene, which can comprise regulatory elements. The 1.5 kb to 2.0 kb region is cloned through PCR methods, using primers that include one in the 3′ direction located at the translation start codon (including appropriate adaptor sequence), and one in the 5′ direction located from 1.5 kb to 2.0 kb upstream of the translation start codon (including appropriate adaptor sequence). The desired fragments are PCR-amplified from Arabidopsis Col-0 genomic DNA using high-fidelity Taq DNA polymerase to minimize the incorporation of point mutation(s). The cloning primers incorporate two rare restriction sites, such as NotI and SfiI, found at low frequency throughout the Arabidopsis genome. Additional restriction sites are used in the instances where a NotI or SfiI restriction site is present within the promoter.
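The fragment selection and restriction-site check described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the function names and the toy sequence are hypothetical, the genomic sequence is assumed to be an upper-case string on the coding strand, and the recognition sequences used are the standard ones for NotI (GCGGCCGC) and SfiI (GGCCNNNNNGGCC).

```python
import re

NOTI_SITE = "GCGGCCGC"
SFII_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")


def upstream_region(genomic: str, start_codon_index: int, length: int = 2000) -> str:
    """Return up to `length` bases immediately upstream of the start codon."""
    if genomic[start_codon_index:start_codon_index + 3] != "ATG":
        raise ValueError("start_codon_index does not point at an ATG")
    begin = max(0, start_codon_index - length)
    return genomic[begin:start_codon_index]


def internal_rare_sites(promoter: str):
    """List which of the two cloning enzymes would also cut inside the fragment."""
    found = []
    if NOTI_SITE in promoter:
        found.append("NotI")
    if SFII_SITE.search(promoter):
        found.append("SfiI")
    return found


if __name__ == "__main__":
    # Toy sequence (hypothetical): a NotI site lies 22 to 29 bases upstream of the ATG.
    toy = "A" * 10 + NOTI_SITE + "T" * 20 + "ATGGCC"
    atg = toy.index("ATG")
    fragment = upstream_region(toy, atg, length=30)
    print(len(fragment), internal_rare_sites(fragment))
```

A non-empty result from internal_rare_sites corresponds to the case in which additional restriction sites would have to be chosen for the cloning primers.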

The 1.5-2.0 kb fragment upstream from the translation start codon, including the 5′-untranslated region of the transcription factor, is cloned in a binary transformation vector immediately upstream of a suitable reporter gene, or a transactivator gene that is capable of programming expression of a reporter gene in a second gene construct. Reporter genes used include green fluorescent protein (and related fluorescent protein color variants), beta-glucuronidase, and luciferase. Suitable transactivator genes include LexA-GAL4, along with a transactivatable reporter in a second binary plasmid (as disclosed in U.S. patent application Ser. No. 09/958,131, incorporated herein by reference). The binary plasmid(s) is transferred into Agrobacterium and the structure of the plasmid confirmed by PCR. These strains are introduced into Arabidopsis plants as described in other examples, and gene expression patterns determined according to standard methods known to one skilled in the art for monitoring GFP fluorescence, beta-glucuronidase activity, or luminescence.

Example XVII Cloning and Transformation of MAF2 (SEQ ID NO: 567), MAF3 (SEQ ID NO: 943), MAF4 (SEQ ID NO: 945), and MAF5 (SEQ ID NO: 947) into Arabidopsis Plants

Experiments were performed using Arabidopsis of ecotype Columbia (Col-0, Lot # 199-014; Lehle Seeds, Round Rock Tex.) except where otherwise indicated. Stockholm accession (CS6863) and Pitztal accession (CS6832) lines were supplied by the Arabidopsis Biological Resource Center (ABRC; Ohio State University, Columbus Ohio). The fca-9 allele was in a Columbia background (Page et al. (1999) Plant J. 17: 231-239). Transgenic 35S:FLC lines were generated as described by Ratcliffe et al. (2001) supra.

Arabidopsis plants were transformed by the floral dip method (Bechtold and Pelletier, 1998; Clough and Bent (1998) supra) using Agrobacterium carrying a standard transformation vector, pMEN20 or pMEN65 (Mendel Biotechnology, Inc., Hayward Calif.), which contained a kanamycin resistance selectable marker system driven by a nos promoter and MAF1-5 (SEQ ID NOs: 1734, 567, 943, 945, and 947, respectively) or FLC (SEQ ID NO: 1874) cDNA oriented 3′ to a CaMV 35S promoter.

In all experiments, seeds were sterilized by a 2 minute EtOH treatment followed by 20 minutes in 30% bleach/0.01% Tween and five washes in distilled water. Seeds were sown to MS agar in 0.1% agarose and stratified for 3-5 days at 4° C., before transfer to growth rooms with a temperature of 20-25° C. MS media was supplemented with 50 mg/l kanamycin for selection of transformed plants. Plants were transplanted to soil after 8 days of growth on plates, when grown under continuous light, and after 10 days when grown under 8 or 12 hour photoperiods. For vernalization treatments, seeds were sown to MS agar plates, sealed with micropore tape, and placed in a 4° C. cold room with low light levels. The plates were then transferred to the growth rooms alongside plates containing freshly sown non-vernalized controls, which had received a short cold stratification of 3 days (to synchronize germination). Time to flowering was measured as the number of days until a flower bud became visible and/or in terms of the total number of leaf nodes formed by the primary shoot meristem. Rosette leaves were counted when a visible inflorescence of approximately three centimeters was apparent.

cDNAs for each of the FLC/MAF1-like (MAF) sequences were identified either among clones in a library derived from leaf RNA, or by a combination of RACE and RT-PCR performed on RNA derived from mixed tissue samples of the Columbia accession. Alternative transcripts were detected for each of the four genes. All these MAF sequences are listed in Table 5B. At least four variants of At5g65050 (MAF2; SEQ ID NO: 567) were identified; variant V (SEQ ID NO: 1950) encodes a 171 amino acid protein (SEQ ID NO: 1951). Variants II and III (SEQ ID NOs: 1944 and 1946) differ in their 3′ region but both generate a protein of 145 amino acids (SEQ ID NOs: 1945 and 1947), the last 20 residues of which are different from the 196 amino acid full-length version. Variant IV (SEQ ID NO: 1948) comprises a truncated form of variant I, and would give rise to a small polypeptide of 80 amino acids (SEQ ID NO: 1949).

At5g65060 (MAF3; SEQ ID NO: 943) and At5g65070 (MAF4; SEQ ID NO: 945)both displayed at least 5 variants. The longest MAF3 product, encoded byvariant I (SEQ ID NO: 943), is 196 amino acids in length (SEQ ID NO:944). MAF3 variants II and III (SEQ ID NOs: 1952 and 1954) encode 185(SEQ ID NO: 1953) and 118 (SEQ ID NO: 1955) amino acid proteinsrespectively, whilst variants IV and V (SEQ ID NO: 1956 and 1958) bothgenerate products of 77 amino acids in length (SEQ ID NOs: 1957 and1959, but differ at in their 3′ regions. The longest MAF4 cloneidentified, variant I (SEQ ID NO: 945) encodes a protein of 200 aminoacids (SEQ ID NO: 946). MAF4 variant II (SEQ ID NO: 1960) encodes a 136amino acid product (SEQ ID NO: 1961) whereas MAF4 variants III, IV, andV (SEQ ID NOs: 1962, 1964, and 1966) encode very short polypeptides of63 (SEQ ID NO: 1963), 66 (SEQ ID NO: 1965), and 69 (SEQ ID NO: 1967)amino acids respectively.

Two alternative variants of At5g65080 (MAF5) (SEQ ID NO: 947) were identified (SEQ ID NOs: 947 and 1968) which encode polypeptides of 198 (SEQ ID NO: 948) and 184 (SEQ ID NO: 1969) amino acids, respectively. The significance of these alternative transcripts is unclear; alternative splicing for MAF1 has been described previously (Ratcliffe et al. (2001) supra; Scortecci et al. (2001) supra), whereas for FLC (SEQ ID NO: 1874), it has not been reported.

MAF2 variants II and III (SEQ ID NOs: 1944 and 1946) were randomly isolated from an in-house library derived from Arabidopsis leaf mRNA. All other MAF clones were isolated by RT-PCR on RNA extracted from whole vegetative Columbia seedlings. RNA was extracted from plant tissue using a CTAB based protocol (Jones et al. (1995) supra), poly(A)+ RNA was purified using oligo(dT)-cellulose (Life Technologies, Inc., Rockville Md.), and first strand cDNA synthesis was performed using a SUPERSCRIPT kit (Life Technologies).

To confirm the gene boundaries, 3′ RACE was first performed using a SMART RACE cDNA Amplification kit (Clontech, Palo Alto Calif.) and two rounds of PCR (30 cycles and 25 cycles) were performed using the following gene specific primers:
MAF2: first round, AAGAAGCAAAAAACATTGTGGGTCTCCG (SEQ ID NO: 1974); second round, CGTCTCCGGCTCCGGAAAACTCTACAAG (SEQ ID NO: 1975);
MAF3: first round, CTGTTGTCGCCGTCTCCGGTTCCGGAAA (SEQ ID NO: 1976); second round, ACTCTACGACTCTGCCTCCGGTGACAA (SEQ ID NO: 1977);
MAF4: first round, ATCAAACGAATTGAGAACAAAAGCTCTC (SEQ ID NO: 1978); second round, CTTATCATCATCTCTGCCACCGGAAGAC (SEQ ID NO: 1979);
MAF5: first round, GGGGATTAGATGTGTCGGAAGAGTGAAG (SEQ ID NO: 1980); second round, AACTCTACAACTCCTCCTCCGGCGACAG (SEQ ID NO: 1981).

RACE products were then cloned to the pGEM-T Easy vector (Promega, Madison Wis.) and sequenced. Following RACE analysis, MAF cDNA clones were PCR-isolated from cDNA using the primers listed below, and then ligated into the transformation vector, following digestion with the restriction enzymes indicated below.
MAF2 (KpnI/NotI): GAGGGGTACCACATTGTGGGTCTCCGGTGATTAGGATC (SEQ ID NO: 1982) and GGGAAAGCGGCCGCAATCAGGCTGTAAGTTTAGGTGAAAGC (SEQ ID NO: 1983);
MAF3 (KpnI/NotI): GAGGGGTACCAGAAAAAAAGCAAACACATTTTGGGTCC (SEQ ID NO: 1984) and GGGAAAGCGGCCGCACAAGAACTCTGATATTTGTCTACTAAG (SEQ ID NO: 1985);
MAF4 (SalI/NotI): GCACGCGTCGACCAAATTAGGTCAGAAGAATTAGTCGGAG (SEQ ID NO: 1986) and GGGAAAGCGGCCGCTCTCCTTGGATGACTTTTCCGTAGCAGG (SEQ ID NO: 1987);
MAF5 (SalI/NotI): GCACGCGTCGACGGGGATTAGATGTGTCGGAAGAGTGAAG (SEQ ID NO: 1988) and GGGAAAGCGGCCGCGATCCTGTCTTCCAAGGTAACACAAAGG (SEQ ID NO: 1989).
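As an illustrative check only, the following minimal Python sketch tests whether each cloning primer listed above contains the recognition sequence of the enzyme named for it. The recognition sequences are the standard sites for KpnI, NotI, and SalI. The pairing of the first-named enzyme with the first primer of each pair and the second-named enzyme with the second primer is an assumption made for this sketch, and the names SITES, CLONING_PRIMERS, has_site, and check_primers are hypothetical.

```python
# Standard recognition sequences for the restriction enzymes named in the text.
SITES = {"KpnI": "GGTACC", "NotI": "GCGGCCGC", "SalI": "GTCGAC"}

# (gene, enzyme assumed for first primer, first primer,
#  enzyme assumed for second primer, second primer)
CLONING_PRIMERS = [
    ("MAF2", "KpnI", "GAGGGGTACCACATTGTGGGTCTCCGGTGATTAGGATC",
     "NotI", "GGGAAAGCGGCCGCAATCAGGCTGTAAGTTTAGGTGAAAGC"),
    ("MAF3", "KpnI", "GAGGGGTACCAGAAAAAAAGCAAACACATTTTGGGTCC",
     "NotI", "GGGAAAGCGGCCGCACAAGAACTCTGATATTTGTCTACTAAG"),
    ("MAF4", "SalI", "GCACGCGTCGACCAAATTAGGTCAGAAGAATTAGTCGGAG",
     "NotI", "GGGAAAGCGGCCGCTCTCCTTGGATGACTTTTCCGTAGCAGG"),
    ("MAF5", "SalI", "GCACGCGTCGACGGGGATTAGATGTGTCGGAAGAGTGAAG",
     "NotI", "GGGAAAGCGGCCGCGATCCTGTCTTCCAAGGTAACACAAAGG"),
]


def has_site(primer, enzyme):
    """Return True if the primer contains the enzyme's recognition sequence."""
    return SITES[enzyme] in primer.upper()


def check_primers(primers=CLONING_PRIMERS):
    """Map each gene to True when both primers carry their assumed site."""
    return {
        gene: has_site(first, first_enzyme) and has_site(second, second_enzyme)
        for gene, first_enzyme, first, second_enzyme, second in primers
    }


if __name__ == "__main__":
    for gene, ok in check_primers().items():
        print(gene, "sites present" if ok else "site missing")
```

A check of this kind only confirms that the sites are present in the primer sequences; it says nothing about digestion efficiency or about internal sites within the amplified cDNA.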

For semi-quantitative RT-PCR expression studies, the following primers were used:
FLC: TTAGTATCTCCGGCGACTTGAACCCAAACC (SEQ ID NO: 1990) and AGATTCTCAACAAGCTTCAACATGAGTTCG (SEQ ID NO: 1991);
MAF2: ACATTGTGGGTCTCCGGTGATTAGGATC (SEQ ID NO: 1992) and AATCAGGCTGTAAGTTTAAGGTGAAAGC (SEQ ID NO: 1993);
MAF3: GAAGAAAAAAAGCAAACACATTTTGGGTCC (SEQ ID NO: 1994) and AAGAACTCTGATATTTGTCTACTAAGGTAC (SEQ ID NO: 1995);
MAF4: ATTAGGTCAGAAGAATTAGTCGGAGAAAAC (SEQ ID NO: 1996) and CTTGGATGACTTTTCCGTAGCAGGGGGAAG (SEQ ID NO: 1997);
MAF5: GGGGATTAGATGTGTCGGAAGAGTGAAG (SEQ ID NO: 1998) and GATCCTGTCTTCCAAGGTAACACAAAGG (SEQ ID NO: 1999);
Actin: AGAGATTCAGATGCCCAGAAGTCTTGTTCC (SEQ ID NO: 2000) and AACGATTCCTGGACCTGCCTCATCATACTC (SEQ ID NO: 2001);
SOC1: GGCATACTAAGGATCGAGTCAGCACCAAAC (SEQ ID NO: 2002) and ACCCAATGAACAATTGCGTCTCTACTTCAG (SEQ ID NO: 2003).

The T-DNA insertion event within MAF2 was initially detected in a pooled collection of approximately 3,000 lines, and then de-replicated to a single plant, by multiple rounds of PCR using the following pairs of T-DNA left border (LB) and gene specific (GS) primers:
First round (40 cycles): LB, CTCATCTAAGCCCCCATTTGGACGTGAATG (SEQ ID NO: 2004) and GS, CAGGCTGTAAGTTTAAGGTGAAAGCTCA (SEQ ID NO: 2005);
Second round (40 cycles): LB nested, TTGCTTTCGCCTATAAATACGACGGATCG (SEQ ID NO: 2006) and GS nested, TGATGATGGTGATTACTTGAGCAGCGGA (SEQ ID NO: 2007).

The insertion position was confirmed by sequencing of the PCR products. Homozygous plants for the MAF2 T-DNA insertion were then identified by the absence of a band following 40 cycles of PCR with the following pair of gene specific primers, which resided on either side of the insertion locus: AAGACAGAACTAATGATGGGGGAAGTGAAGTCC (SEQ ID NO: 2008) and TACGAAGGTACAATAAAGATCTACTATAGC (SEQ ID NO: 2009).
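The genotyping logic described above can be sketched as a small decision function. This is a hypothetical illustration, not the procedure actually scripted by the inventors: it assumes that the T-DNA is too large for the flanking gene specific primer pair to amplify across under the PCR conditions used, so that a flanking band reports an intact (wild-type) allele, while a left border/gene specific band reports an insertion allele. The function name call_genotype and the returned labels are illustrative.

```python
def call_genotype(flanking_band, insertion_band):
    """Infer a plant's genotype at the insertion locus from two PCR results.

    flanking_band:  True if the gene specific primers flanking the locus
                    gave a product (assumed to report an intact allele).
    insertion_band: True if the left border / gene specific primer pair
                    gave a product (assumed to report a T-DNA allele).
    """
    if insertion_band and not flanking_band:
        return "homozygous"   # only T-DNA alleles detected
    if insertion_band and flanking_band:
        return "hemizygous"   # one T-DNA allele, one intact allele
    if flanking_band:
        return "wild type"    # only intact alleles detected
    return "no call"          # neither product; the reaction may have failed


if __name__ == "__main__":
    print(call_genotype(flanking_band=False, insertion_band=True))
```

Including a "no call" outcome guards against scoring a failed reaction as homozygous, since the homozygous call rests on the absence of a band.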

Example XVIII Examples of Genes that Confer Significant Improvements to Plants

Examples of genes and homologs that confer significant improvements to knockout or overexpressing plants are noted below. Experimental observations made by us with regard to specific genes whose expression has been modified in overexpressing or knock-out plants, and potential applications based on these observations, are also presented.

SEQ ID NOs: 567, 943, 945, 947, 1734, 1874, 1970, and 1972 (G859, G1842, G1843, G1844, G157, G1759, Soy1, and Soy3, respectively) and the encoded polypeptides can be used to alter flowering time.

SEQ ID NO: 1944, SEQ ID NO: 1946, SEQ ID NO: 1948, SEQ ID NO: 1950, SEQ ID NO: 1952, SEQ ID NO: 1954, SEQ ID NO: 1956, SEQ ID NO: 1958, SEQ ID NO: 1960, SEQ ID NO: 1962, SEQ ID NO: 1964, SEQ ID NO: 1966, and SEQ ID NO: 1968, and the encoded polypeptides can be used to alter flowering time.

Differences in flowering time displayed by 35S::G859, 35S::G1842, 35S::G1843, 35S::G1844, 35S::G157, and 35S::G1759 transformants indicate that these genes or their homologs can be used to manipulate the flowering time of commercial species (see “Detailed description of genes, traits and utilities that affect plant characteristics; Flowering time: late flowering”, above).

Example XIX Identification of a T-DNA Insertion Mutant for MAF2 (SEQ ID NO: 567)

A single plant hemizygous for a T-DNA insertion within At5g65050 was identified by PCR screening of an in-house collection of random insertion lines. Sequencing from a primer matching the left T-DNA border revealed that the T-DNA resided within the predicted final intron of the gene (a position corresponding to nucleotide 3443 of At5g65050 (SEQ ID NO: 567), within the final intron of the putative full-length splice-variant). Self-crossed seed was collected from the individual containing the insertion, and these progeny were examined in the next generation. The progeny plants were genotyped using PCR, and four out of twenty individual plants were identified as being homozygous for the T-DNA insertion. These four individual plants all showed visible flower buds at 13 days under continuous light conditions, at least two days earlier than any of the wild-type Columbia ecotype control plants, or any of their hemizygous or wild-type siblings growing in the same flat. Thus, it appeared that At5g65050 functioned as a repressor of the floral transition. At5g65050 was therefore designated as MAF2 (MADS AFFECTING FLOWERING 2; SEQ ID NO: 567). Homozygous seed was collected from the four individual early flowering plants. To examine the effects of the mutation on endogenous MAF2 expression, RNA was extracted from pooled 10-day-old seedlings in the next generation. Semi-quantitative RT-PCR was performed, using primers specific to MAF2; endogenous MAF2 transcripts in the maf2 seedlings were undetectable, but strong endogenous MAF2 expression was detected in the wild-type controls, demonstrating that MAF2 activity had been substantially reduced or eliminated in the maf2 mutant (see FIG. 10).
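For context, the recovery of four homozygous plants among twenty progeny can be compared with the one-quarter proportion expected for a single insertion locus segregating in a Mendelian fashion from a selfed hemizygous parent. The sketch below is an illustration added here, not an analysis reported in the text; the single-locus Mendelian model and the names binom_cdf, n_plants, and n_homozygous are assumptions.

```python
from math import comb


def binom_cdf(k, n, p):
    """Probability of observing at most k successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_plants = 20        # progeny genotyped (from the text)
n_homozygous = 4     # progeny homozygous for the insertion (from the text)
p_homozygous = 0.25  # assumed single-locus Mendelian expectation

expected = n_plants * p_homozygous
p_at_most = binom_cdf(n_homozygous, n_plants, p_homozygous)

if __name__ == "__main__":
    print(f"expected homozygous plants: {expected}")
    print(f"P(at most {n_homozygous} of {n_plants}): {p_at_most:.3f}")
```

Under these assumptions the observed count sits close to the expected five plants, so the result gives no indication of distorted segregation, although twenty plants is a small sample.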

FIG. 10 shows the effect of vernalization on the maf2 mutant: (A) the maf2 mutant is marginally earlier flowering than wild type Columbia, in the absence of vernalization; plants are shown after 50 days growth under a 12-hour photoperiod; imbibed seeds were stratified for 3 days at 4° C. before transfer to the growth room; (B) the maf2 mutant is considerably earlier flowering than wild type Columbia following a short vernalization treatment; plants are shown after 45 days under a 12-hour photoperiod; imbibed seeds were cold treated for 15 days at 4° C. before transfer to the growth room; (C) the maf2 mutant responds prematurely to vernalization; the chart depicts data from experiment 4 of Table 15; bars represent days to visible flower bud under a 12-hour photoperiod, following 4° C. cold-treatments of 3, 6, 10, 15, 21, and 85 days on imbibed seeds; error bars indicate standard errors to which 95% confidence limits have been attached; note that a 10-day cold treatment significantly reduced time to flowering in the maf2 mutant, but not in wild type Columbia ecotype; (D) expression of FLC, MAF2 and SOC1 in maf2 and wild type Columbia seedlings following cold treatments of 3, 6, 10, 15, 21, and 85 days on imbibed seeds; RNA was extracted from pools of ten whole seedlings after 10 days growth under 12-hour light, and expression monitored by reverse transcriptase PCR (FLC, 30 cycles; MAF2, 35 cycles; SOC1, 30 cycles; actin, 25 cycles). The ‘forward slash’ symbol (/) in FIG. 10 identifies a water control; the premature vernalization response of maf2 seen in (C) does not appear to be correlated with a premature decline in FLC levels, or a premature increase of SOC1; note that MAF2 transcript is absent from the maf2 seedlings, but is present at a constant level between the 3, 6, 10, 15, and 21-day time points in wild type, then declines by the 85 day sample; note that the MAF2 product is a doublet corresponding to splice-variants II and I.

Example XX MAF2 Functions as a Floral Repressor that Prevents Vernalization in Response to Short Cold Periods

Populations of homozygous maf2 and wild-type plants were grown under continuous light, a 12-hour photoperiod, and an 8-hour photoperiod. In each case, the maf2 plants on average flowered slightly earlier than the controls in terms of both days to visible flower buds and total leaf number (see experiment 1, Table 15). Thus, MAF2 acts as a floral repressor, but appears to play a relatively minor role in determining flowering time under the conditions of these experiments.

Null mutants for flc show a very much weaker response to vernalization than wild type (Michaels and Amasino (2001) supra). However, the fact that flc mutants can exhibit a vernalization response demonstrates that other factors must contribute to maintenance of a vernalization requirement. To test whether MAF2 could be one of those factors, batches of germinating wild type and maf2 seedlings were subjected to an extensive cold treatment of 52 days. The seedlings were then grown alongside non-vernalized controls under a 12-hour photoperiod (see experiment 2, Table 15). In this experiment, non-vernalized maf2 plants produced flower buds around 6 days sooner than non-vernalized wild type, confirming our earlier observations. However, maf2 seedlings showed a similar response to wild type. Vernalization of maf2 seedlings reduced the time to bud emergence by 31% and the total leaf number by 47% (with respect to non-vernalized maf2 plants). By comparison, in the wild-type seedlings, vernalization produced a 34% (time to bud emergence) and a 53% (total leaf number) reduction, respectively. The vernalization response of the maf2 mutant contrasts with the weaker response described for flc mutants (Michaels and Amasino (2001) supra). Thus, although MAF2 acts as a floral repressor, either it does not directly maintain a vernalization requirement in the same manner as FLC, or it has a more minor role in the maintenance of a vernalization requirement.
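The percentage reductions quoted above follow the usual definition of a reduction relative to the non-vernalized value. The sketch below shows that calculation and then expresses the maf2 response as a fraction of the wild-type response using only the percentages stated in the text. The underlying measurements are in Table 15 and are not reproduced here, so the function percent_reduction is illustrated only through its definition; the names used are hypothetical.

```python
def percent_reduction(non_vernalized, vernalized):
    """Percentage by which vernalization reduced a measurement."""
    return 100.0 * (non_vernalized - vernalized) / non_vernalized


# Percentage reductions as stated in the text (experiment 2, Table 15).
reported = {
    "maf2": {"days_to_bud": 31.0, "leaf_number": 47.0},
    "wild type": {"days_to_bud": 34.0, "leaf_number": 53.0},
}

# maf2 response expressed as a fraction of the wild-type response.
relative_response = {
    trait: reported["maf2"][trait] / reported["wild type"][trait]
    for trait in ("days_to_bud", "leaf_number")
}

if __name__ == "__main__":
    for trait, ratio in relative_response.items():
        print(f"{trait}: maf2 response is {ratio:.2f} of wild type")
```

On these figures the maf2 response is roughly nine-tenths of the wild-type response for both traits, consistent with the statement that maf2 seedlings responded similarly to wild type after a long cold treatment.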

An additional experiment was performed in which batches of maf2 and wild-type seedlings were subjected to a cold treatment for a period of only 10 days. Following this treatment, the maf2 population flowered proportionally much earlier than in any of our previous experiments (see experiment 3, Table 15), with a mean total leaf number of 11.1+/−0.7 versus 19.0+/−0.8 in wild-type. To confirm this result, batches of maf2 and wild type plants were given a range of different cold-treatments: 3, 6, 10, 15, 21, and 85 days (see FIG. 10 and Experiment 4, Table 15). maf2 plants given intermediate cold-periods of 10, 15, or 21 days (which is not sufficiently long to elicit a full vernalization response in the wild-type) showed a strong response and flowered disproportionately early. Thus, a specific role of MAF2 appeared to be the repression of premature vernalization in response to brief cold spells.

To test whether the observed acceleration of flowering in the maf2 mutant was accompanied by a decline in FLC levels, RNA was extracted from whole seedlings at 10 days following the cold treatments. RT-PCR experiments showed that FLC levels declined progressively in relation to the duration of the cold treatment (see FIG. 10), which confirmed the results of Sheldon et al. ((2000) supra). However, for each of the time-points there were no clearly discernible differences in FLC levels between maf2 and the wild-type controls (see FIG. 10). Thus, the premature vernalization response in the maf2 seedlings was apparently induced independently of changes in FLC transcription.

In both the maf2 and the wild-type samples, it was observed that by the 10-day cold time-point, FLC levels had already fallen very substantially when compared to the 3 and 6 day time points (see FIG. 10D). However, despite this decline in FLC levels, there was very little difference in the flowering time of wild-type plants that had received 10, 15, or 21 days of cold compared to those that had been given 6 days of cold. A very marked reduction in flowering time was seen only in the wild type plants that had been given the extensive 85 day cold treatment. By contrast, in the maf2 mutant, a 10-day cold treatment substantially accelerated flowering, and a 21-day cold treatment produced an effect equivalent to that caused by an 85-day cold treatment. RT-PCR with MAF2 specific primers was then performed. MAF2 transcript was absent from the maf2 mutant. In contrast to FLC, MAF2 levels in the wild type were identical between the 3, 6, 10, 15, and 21-day time points. However, MAF2 expression had declined in the 85 day cold treatment sample, for which a pronounced reduction in flowering time was observed. These data suggested that MAF2 expression compensated for the apparent decline in FLC levels caused by the 10, 15, or 21 day cold treatments, and thereby maintained flowering at the same time as for the 6 day cold treatment in wild type.

Example XXI Overexpression of MAF2, MAF3, MAF4, or MAF5 Modifies Flowering Time

To further investigate the role of MAF2 as a floral repressor, and determine whether At5g65060, At5g65070, and At5g65080 could also affect flowering time, transgenic Arabidopsis lines containing the full-length cDNA of each of the genes expressed from the 35S CaMV promoter were analyzed.

Overexpression lines for each of the FLC/MAF1 paralogs (At5g65060, At5g65070, and At5g65080) displayed various alterations in the timing of flowering compared to wild-type control plants (see Tables 16 and 17). Flowering time was monitored in primary transformants and/or in a number of independent lines in the second generation, and the results are described in detail below. Accordingly, the At5g65060, At5g65070, and At5g65080 loci were designated as MADS AFFECTING FLOWERING 3, 4, and 5 (MAF3, (SEQ ID NO: 943); MAF4, (SEQ ID NO: 945); and MAF5, (SEQ ID NO: 947)), respectively.

Example XXII Overexpression of MAF2 (SEQ ID NO: 567) Produces Similar Phenotypes to Overexpression of FLC or MAF1 in the Columbia and Stockholm Backgrounds

Two separate batches of 35S:MAF2 primary transformants in the Columbia ecotype were examined under conditions of continuous light. In the two experiments, approximately half of the lines flowered early, displaying visible flower buds around a week earlier, and producing significantly fewer leaves than control plants lacking the transgene (see Table 16). In batch 1 (see experiment 5) 3 of 20 T1 plants flowered within the wild-type range; in batch 2 (see experiment 6) 10 of 19 T1 plants flowered within the wild-type range. However, in both sets of plants, a small proportion of the lines flowered distinctly later than wild type.

The progeny of two late flowering 35S:MAF2 lines and two early flowering 35S:MAF2 Columbia lines were examined in the T2 generation (see experiment 8, Table 17). All individuals from the two late lines (line T2-16 and line T2-24) also flowered markedly later than wild-type controls in the T2 generation, under conditions of either 24 or 12 hours light. In contrast, the phenotype of the early flowering lines was less consistent between generations. Under continuous light conditions, no significant difference in flowering time was observed in terms of either days to visible flower bud, or total leaf number. Under the less inductive conditions of 12 hours light, however, a very marginal acceleration of flowering was noted. Thus, the most consistent effect of MAF2 overexpression, between generations, was a delay in flowering, even though the majority of lines were early flowering as primary transformants. Semi-quantitative RT-PCR studies performed on rosette leaves from the T1 plants revealed that the late flowering lines had higher levels of MAF2 overexpression than the early flowering lines.

The effects of MAF2 (SEQ ID NO: 567) overexpression were also tested using the late flowering Stockholm accession (which has significantly higher levels of FLC than the Columbia accession; Sheldon et al. (1999) supra; Michaels and Amasino (1999) supra; Ratcliffe et al. (2001) supra). Data similar to those from the Columbia accession were obtained with the Stockholm lines: 6 of 13 lines flowered early, 6 of 13 lines flowered within the wild-type range, and a single line flowered 2-3 weeks late (see experiment 7, Table 16). Therefore, the effects of MAF2 overexpression were the same irrespective of FLC levels, with a substantial number of lines flowering early and a minority of lines flowering late.

The MAF2 overexpression data paralleled those that had been described for FLC or MAF1. Although FLC and MAF1 had been established to be repressors of flowering (Sheldon et al. (1999) supra; Michaels and Amasino (1999) supra; Ratcliffe et al. (2001) supra; Scortecci et al. (2001) supra), FLC and MAF1 induce flowering in a high proportion of lines, when overexpressed in accessions other than Landsberg. In the Landsberg accession, FLC or MAF1 overexpression produced late flowering, but no early flowering lines were noted (Michaels and Amasino (1999) supra; Ratcliffe et al. (2001) supra; Scortecci et al. (2001) supra). Comparable data were acquired when the 35S:MAF2 construct was transformed into the Landsberg accession; only late flowering lines were obtained (see experiment 11, Table 12). The MAF2 overexpression data, combined with the acceleration of flowering observed in the maf2 mutant, indicated that MAF2 is a repressor of flowering.

Example XXIII MAF2 (SEQ ID NO: 567) Prevents Vernalization Independently of FLC Transcription, but Represses SOC1 Expression

To examine whether MAF2 (SEQ ID NO: 567) could delay or prevent vernalization, 35S:MAF2 seedlings were tested to evaluate the response to extensive cold treatments (see FIG. 16 and Table 17, experiments 9 and 10). Two separate studies were performed; in the first instance, batches of 35S:MAF2 (T2 from the late flowering line 16; T1-16) and wild-type germinating seedlings were placed in a cold room at 4° C. for a period of 76 days, and then transferred to a growth room (12-hour photoperiod, 20-25° C.) alongside a freshly sown non-treated batch. In the repeat experiment, seedlings were cold-treated for 56 days and then transferred to a second growth room (24 hours light, 20-25° C.). In both experiments, 35S:MAF2 plants were completely unresponsive to the cold treatment and flowered as late as the non-vernalized specimens grown alongside. The wild-type control plants (and wild-type segregants from the 35S:MAF2 population) verified the effectiveness of the vernalization treatment; in both experiments, cold-treated wild-type plants flowered significantly earlier than non-treated individuals.

To determine if this absence of a vernalization response was due to MAF2 preventing a fall in FLC transcript levels following cold-treatments, RT-PCR experiments were performed on cold treated and non-treated 35S:MAF2 seedlings that had been harvested at 8 days following transfer into the growth room. Although the cold-treated 35S:MAF2 seedlings were as late flowering as their non-treated siblings, FLC transcript abundance was greatly reduced to levels similar to those found in vernalized wild-type plants (see FIG. 11).

FIG. 11 shows the effects of MAF2 (SEQ ID NO: 567) overexpression in the Columbia accession.

FIG. 11A shows that 35S:MAF2 plants were late flowering and did not respond to vernalization. Plants are shown after approximately 50 days under a 12-hour photoperiod, following a 10 week 4° C. cold-treatment on the imbibed seeds. FIG. 11B shows expression of FLC, MAF2, and SOC1 in wild type Columbia, 35S:MAF2, and 35S:FLC seedlings. RNA was extracted from pools of ten whole seedlings after 10 days growth under 12-hour light, following either a 3-day cold stratification (-, non-vernalized) or a 76-day vernalization treatment (+, vernalized). Expression was monitored by RT-PCR (25 cycles). Note that no significant changes in FLC levels are apparent in the 35S:MAF2 samples relative to the wild-type control, and the SOC1 levels are repressed in the 35S:MAF2 plants in a comparable manner to the 35S:FLC samples.

Levels of FLC transcript in the non-vernalized 35S:MAF2 plants were also much lower than in 35S:FLC controls. Thus, although MAF2 is capable of preventing a premature vernalization response, it cannot inhibit the eventual depletion of FLC transcript by long cold-treatments. Furthermore, the fact that cold-treated 35S:MAF2 plants were late flowering, despite containing undetectable levels of FLC, indicated that MAF2 could repress flowering via pathways independent or downstream of FLC.

A major target of repression by FLC is the MADS-box gene SOC1 (Michaels and Amasino (2001) supra). To test whether MAF2 (SEQ ID NOs: 567 and 568) can also influence downstream targets of the FLC repression pathway, SOC1 expression levels were examined in the 35S:MAF2 seedlings (see FIG. 11). SOC1 expression levels were extremely low, compared to wild type Columbia plants, in both vernalized and non-vernalized 35S:MAF2 plants. Thus, MAF2 overexpression was sufficient to maintain repression of SOC1, even when repression by FLC had been reduced via extensive cold treatments.

Example XXIV Effects of MAF3 (SEQ ID NO: 943), MAF4 (SEQ ID NO: 945), and MAF5 (SEQ ID NO: 947) Overexpression

Given that MAF2 (like FLC and MAF1) acts as a repressor of flowering, the remaining three MAF transcription factors, MAF3 (SEQ ID NO: 943), MAF4 (SEQ ID NO: 945), and MAF5 (SEQ ID NO: 947), were tested to determine whether they had similar repressive effects on flowering time when overexpressed in a transgenic plant.

1. Overexpression of MAF3 (SEQ ID NO: 943) Accelerates Flowering in Columbia and Stockholm, but Delays Flowering in the Landsberg Ecotype

In the case of MAF3, Columbia lines overexpressing MAF3 (SEQ ID NO: 943) displayed accelerated flowering. In two separate experiments, fourteen out of a total of eighteen 35S:MAF3 primary transformants (Columbia accession) displayed flower buds several days earlier than wild-type plants under continuous light conditions (see experiments 5 and 6, Table 16). In contrast, no late flowering MAF3 overexpression lines were obtained in the Columbia accession, and the remaining transformants flowered at a wild-type time. The progeny of three early flowering MAF3 overexpression lines were also examined and these also flowered sooner than wild-type controls in the T2 generation (see experiment 8, Table 17). Comparable results were obtained upon overexpression of MAF3 in the Stockholm accession (see experiment 7, Table 16). Ten out of 20 lines (all primary transformants) flowered earlier than controls. Of the remaining lines, seven flowered at a wild-type time and three lines were scored as flowering marginally late. However, this delay was not significantly different from wild type in terms of total leaf number. Given that no substantially late flowering overexpression lines were obtained in Columbia or Stockholm, MAF3 was observed to have contrasting effects to FLC, MAF1, and MAF2. Nevertheless, a number of 35S:MAF3 Landsberg accession lines were clearly late flowering, particularly under non-inductive photoperiodic conditions of 12-hour light (see experiment 11, Table 17). Thus, either overexpression levels of MAF3 in the Columbia and Stockholm accession lines were not sufficiently high to cause severe repression, or the MAF3 protein requires a partner to act as a repressor in those accessions, but alone is sufficient to act as a repressor in the Landsberg accession.

2. Overexpression of MAF4 (SEQ ID NO: 945) Modifies Flowering Time but Produces Deleterious Effects on Growth and Development

For unknown reasons, transcriptional overexpression of the third gene in the Arabidopsis chromosomal cluster, MAF4 (SEQ ID NO: 945), produced many pleiotropic effects on growth and development. 35S:MAF4 lines were often very small, stunted, displayed a slow growth rate, and formed poorly developed inflorescences that set relatively few seeds. Additionally, some lines exhibited premature senescence of rosette leaves, whilst others arrested growth during the seedling stage and senesced without flowering. Despite these deleterious effects, however, alterations in flowering time could still be observed. As described above, approximately half of the MAF3 Columbia overexpression lines flowered earlier than controls (see experiments 5 and 6, Table 16). In contrast, a single 35S:MAF4 Columbia T1 plant was late flowering. A number of late flowering lines were also obtained in the Stockholm and Landsberg accessions, showing that the gene can repress flowering (see experiment 7, Table 16, and experiment 11, Table 17, respectively).

3. Effects of MAF5 (SEQ ID NO: 947) Overexpression

Overexpression of the fourth gene in the Arabidopsis chromosomal cluster, MAF5 (SEQ ID NO: 947), produced somewhat inconsistent effects on flowering time (see Tables 16 and 17). In a first study (see experiment 5, Table 16), five of ten 35S:MAF5 Columbia transformants flowered significantly earlier than wild-type controls, whereas the remainder flowered at a wild-type time. In a second study (see experiment 6, Table 16), in which a larger number of lines were examined, only three of nineteen lines flowered early. Fifteen of the remaining lines showed a wild-type flowering time whilst a single plant flowered slightly late, making a slightly larger number of leaves than controls, but displaying flower buds at the same date. The progeny of three early flowering 35S:MAF5 lines were examined in the T2 generation in both continuous light and a 12 hour photoperiod, but no consistent difference in flowering time from wild-type was seen (see experiment 8, Table 17). Thus, although MAF5 appeared to promote flowering in the Columbia accession, such effects varied between experiments, generations and genetic backgrounds, suggesting that this result was conditional on unresolved variables.

The 35S:MAF5 transgene was also introduced into the Stockholm accession (see experiment 7, Table 16). In this experiment, all except two of the lines flowered within the wild-type range. The remaining two lines were observed to flower marginally later than wild type (either a leaf or a day later). As with MAF2-4, delayed flowering occurred when MAF5 was overexpressed in the Landsberg accession; however, such effects were much less marked than for MAF2-4. Thus, although the Landsberg results indicated that MAF5 could delay flowering, the floral repression activity of MAF5 appeared less extreme than for MAF2, MAF3 and MAF4.

Example XXV Endogenous Expression of MAF3 (SEQ ID NO: 943) and MAF4 (SEQ ID NO: 945) Repressed by Vernalization; Endogenous Expression of MAF5 (SEQ ID NO: 947) Up-Regulated by Vernalization

The mutant analysis of maf2 along with the overexpression studies on MAF2-5 (SEQ ID NOs: 567, 943, 945, and 947, respectively) demonstrated that each of the genes could influence flowering time, and that MAF2 (SEQ ID NO: 567) prevents premature vernalization. Using RT-PCR analysis, it was observed that all of these genes were expressed across a wide range of tissue types (data not shown), similarly to that which had been described for FLC and MAF1 (Sheldon et al. (1999) supra; Michaels and Amasino (1999) supra; Alvarez-Buylla (2000b) Plant J. 24: 457-466; Ratcliffe et al. (2001) supra; Scortecci et al. (2001) supra). A key feature of the mechanism by which FLC acts is that FLC transcript and protein levels decrease in response to long cold treatments of 4-6 weeks, thereby allowing the floral transition to occur (Sheldon et al. (1999) supra; Michaels and Amasino (1999) supra; Sheldon et al. (2000) supra; Rouse et al. (2002) supra). The expression levels of MAF1 are also affected by vernalization in certain genetic backgrounds (Ratcliffe et al. (2001) supra).

To examine whether expression levels of endogenous MAF2-5 (SEQ ID NOs: 567, 943, 945, and 947, respectively) were also influenced by vernalization, expression of each of the genes was compared between vernalized and non-vernalized seedlings by RT-PCR using gene specific primers (see FIG. 8). FIG. 8 shows the effects of vernalization on expression of MAF2-5 (SEQ ID NOs: 567, 943, 945, and 947, respectively) in different genetic (accession) backgrounds.

RNA was extracted from pools of 10-20, 8-day old seedlings grown under continuous light conditions. Expression was monitored by RT-PCR (MAF1-5 (SEQ ID NOs: 9, 567, 943, 945, and 947, respectively), and FLC (SEQ ID NO: 1874), 30 cycles; actin, 25 cycles). Vernalized (+) samples were cold-treated for 6 weeks at 4° C., whereas non-vernalized (−) samples were stratified for only 3 days at 4° C. as imbibed seeds. Col = Columbia, Pi-0 = Pitztal, St-0 = Stockholm, fca = fca-9 mutant. Note that FLC levels are lower in vernalized than non-vernalized samples, confirming the efficacy of the treatment. MAF2 (SEQ ID NO: 567) levels are similar between vernalized and non-vernalized seedlings from all backgrounds. MAF3 (SEQ ID NO: 943) and MAF4 (SEQ ID NO: 945) transcript levels are lower in vernalized compared to non-vernalized samples for each of the different backgrounds. MAF5 (SEQ ID NO: 947) transcript levels, in contrast to MAF1-4 and FLC, were elevated in vernalized compared to non-vernalized samples for each of the different backgrounds.

Germinating seeds from a number of genetic backgrounds were vernalized in a cold room for a period of 6 weeks and then transferred to a growth chamber along with freshly sown non-vernalized controls. After 8 days in continuous light conditions, whole seedlings were harvested and RNA was extracted. FLC transcript levels were substantially higher in non-vernalized versus vernalized seedlings, in all of the backgrounds, confirming the efficacy of the treatment. In this experiment, MAF2 (SEQ ID NO: 567) transcripts displayed no clear consistent differences in expression level between non-vernalized plants and those given a 6 week cold treatment. However, in other experiments, MAF2 transcript levels were eventually reduced following excessively long cold treatments, of up to 10-12 weeks (see FIGS. 10 and 11). Both MAF3 (SEQ ID NO: 943) and MAF4 (SEQ ID NO: 945) appeared responsive to a 6-week vernalization, and their expression paralleled that of FLC; transcript levels of MAF3 and MAF4 appeared somewhat elevated in fca and Stockholm compared to Columbia seedlings; in all backgrounds, transcript levels were markedly lower in vernalized compared to non-vernalized samples.

MAF5 (SEQ ID NO: 947) showed an opposite endogenous expression pattern compared with FLC. In all of the genetic backgrounds, endogenous MAF5 transcript levels were low in non-vernalized samples and became elevated in seedlings that had been vernalized. Thus, although the MAF2-5 genes are arranged in a very tight cluster on the Arabidopsis chromosome, their expression appears to be under distinct modes of transcriptional regulation, yet they remain involved in the plant's response to cold treatment.

FIG. 9 is a schematic diagram summarizing the observed responses of FLC and MAF1-5 (SEQ ID NOs: 1874, 1734, 567, 943, 945, and 947, respectively) to vernalization and their potential effects on the floral transition. Arrows indicate positive interactions; blunt-ended lines denote inhibition.

Example XXVI Identification of Homologous Sequences

This example describes the identification, by a computer homology search, of genes that are orthologous to Arabidopsis thaliana MAF transcription factors.

Homologous sequences, including those of paralogs and orthologs from Arabidopsis and other plant species, were identified using database sequence search tools, such as the Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215: 403-410; Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402). The tblastx sequence analysis programs were employed using the BLOSUM62 scoring matrix (Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. 89: 10915-10919). The entire NCBI GenBank database was filtered for sequences from all plants except Arabidopsis thaliana by selecting all entries in the NCBI GenBank database associated with NCBI taxonomic ID 33090 (Viridiplantae; all plants) and excluding entries associated with taxonomic ID 3701 (Arabidopsis thaliana).
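The database filter described above can be pictured as a simple lineage test. The following is a minimal Python sketch under stated assumptions, not the procedure actually used: it assumes each database entry carries the NCBI taxonomic IDs of its lineage, and the `Entry` record, its field names, the example accessions, and the abbreviated lineages are all hypothetical. The two taxonomic IDs are the ones given in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

INCLUDE_TAXID = 33090  # Viridiplantae, as stated in the text
EXCLUDE_TAXID = 3701   # ID excluded in the text


@dataclass(frozen=True)
class Entry:
    """Hypothetical database entry: an accession plus its taxonomic lineage."""
    accession: str
    lineage: Tuple[int, ...]  # taxonomic IDs from the root down to the entry's taxon


def keep_entry(entry: Entry,
               include: int = INCLUDE_TAXID,
               exclude: int = EXCLUDE_TAXID) -> bool:
    """Keep entries under `include` that are not also under `exclude`."""
    return include in entry.lineage and exclude not in entry.lineage


def filter_entries(entries: List[Entry]) -> List[Entry]:
    """Return only the entries that pass the lineage test, in input order."""
    return [e for e in entries if keep_entry(e)]


# Illustrative entries with made-up accessions and shortened lineages.
EXAMPLE_ENTRIES = [
    Entry("plant_entry", (1, 33090, 4530)),
    Entry("excluded_entry", (1, 33090, 3701)),
    Entry("non_plant_entry", (1, 4932)),
]

if __name__ == "__main__":
    for entry in filter_entries(EXAMPLE_ENTRIES):
        print(entry.accession)
```

Testing membership in the full lineage, rather than comparing only the entry's own taxon, is what lets a single higher-level ID select or exclude every taxon beneath it.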

These sequences were compared to sequences representing genes of SEQ ID NOs: 568, 944, 946, 948, 1735, 1875, 1971, and 1973, using the Washington University TBLASTX algorithm (version 2.0a19 MP) at the default settings using gapped alignments with the filter “off”. For each gene of SEQ ID NOs: 568, 944, 946, 948, 1735, and 1875, individual comparisons were ordered by probability score (P-value), where the score reflects the probability that a particular alignment occurred by chance. For example, a score of 3.6e-40 is 3.6×10⁻⁴⁰. In addition to P-values, comparisons were also scored by percentage identity. Percentage identity reflects the degree to which two segments of DNA or protein are identical over a particular length. Examples of sequences so identified are presented in Table 7 and Table 9. Paralogous or orthologous sequences were readily identified and are available in GenBank by accession number (Table 7; Test sequence ID). The percent sequence identity among these sequences can be as low as 47%, or even lower.
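The two scores described above can be illustrated with a short Python sketch. It is an illustration only and does not reproduce the TBLASTX implementation: `percent_identity` assumes two already-aligned strings of equal length with `-` marking gaps, and it excludes gapped columns from the denominator, which is one of several conventions in use. The hit names and the P-values other than 3.6e-40 are hypothetical.

```python
from typing import List, Tuple


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over the ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def rank_by_p_value(hits: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Order (name, P-value) pairs so the least chance-like alignment is first."""
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # "3.6e-40" is scientific notation for 3.6 x 10^-40.
    p_value = float("3.6e-40")
    hits = [("hit_b", 2.1e-12), ("hit_a", p_value), ("hit_c", 0.5)]
    print([name for name, _ in rank_by_p_value(hits)])
    print(percent_identity("MKV-LA", "MKVQLS"))
```

A smaller P-value means the alignment is less likely to have arisen by chance, so sorting in ascending order places the strongest candidate first.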

In addition, the sequences representing genes of SEQ ID NOs: 568, 944, 946, 948, 1735, and 1875 were compared, using the Washington University TBLASTX algorithm as described above, to sequences in a proprietary database comprising polynucleotide sequences isolated from soy. In this manner, SEQ ID NOs: 1014, 1971, and 1973 from soy were identified as putative orthologs. The soy sequences were then used as queries against a proprietary Arabidopsis transcription factor database using the Washington University TBLASTX algorithm, as described above (reciprocal BLAST). In each case, the subject sequence with the highest probability score was one of SEQ ID NOs: 568, 944, 946, 948, 1735, and 1875, thereby confirming that SEQ ID NOs: 1014, 1971, and 1973 are, more likely than not, soy MAF transcription factor orthologs of the Arabidopsis MAF transcription factors.

Candidate paralogous sequences were identified among Arabidopsis MAF transcription factors through alignment, identity, and phylogenetic relationships. A list of paralogs is shown in Table 9. Candidate orthologous sequences were identified from proprietary unigene sets of plant gene sequences in Zea mays, Glycine max, and Oryza sativa based on significant homology to Arabidopsis MAF transcription factors. These candidates were reciprocally compared to the set of Arabidopsis MAF transcription factors. If the candidate showed maximal similarity in the protein domain to the eliciting MAF transcription factor or to a paralog of the eliciting MAF transcription factor, then it was considered to be an ortholog. Identified non-Arabidopsis sequences that were shown in this manner to be orthologous to the Arabidopsis sequences are provided in Table 7.
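The reciprocal comparison rule stated above can be sketched in Python as follows. This is a hedged illustration of the decision rule only: it assumes the reverse search results are already available as a mapping from each candidate to its Arabidopsis hits and their P-values, and every identifier and score shown (`cand_1`, `At_A`, and so on) is hypothetical rather than taken from the tables.

```python
from typing import Dict, Set


def best_hit(hits: Dict[str, float]) -> str:
    """Return the subject with the smallest P-value from {subject: P-value}."""
    if not hits:
        raise ValueError("no hits to rank")
    return min(hits, key=hits.get)


def is_candidate_ortholog(candidate: str,
                          eliciting: str,
                          reverse_hits: Dict[str, Dict[str, float]],
                          paralogs: Dict[str, Set[str]]) -> bool:
    """Apply the reciprocal rule: the candidate's best Arabidopsis hit must be
    the eliciting sequence itself or one of that sequence's listed paralogs."""
    top = best_hit(reverse_hits[candidate])
    return top == eliciting or top in paralogs.get(eliciting, set())


# Hypothetical reverse-search results (candidate -> Arabidopsis hit -> P-value).
REVERSE_HITS = {
    "cand_1": {"At_A": 1e-50, "At_X": 1e-10},
    "cand_2": {"At_X": 1e-30, "At_A": 1e-5},
    "cand_3": {"At_B": 1e-40, "At_X": 1e-20},
}
# Hypothetical paralog list: At_B is treated as a paralog of At_A.
PARALOGS = {"At_A": {"At_B"}}

if __name__ == "__main__":
    for cand in REVERSE_HITS:
        print(cand, is_candidate_ortholog(cand, "At_A", REVERSE_HITS, PARALOGS))
```

Accepting a paralog of the eliciting sequence as the top reverse hit keeps the test from rejecting candidates merely because closely related Arabidopsis genes score almost identically.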

Example XXVII Introduction of MAF Polynucleotides into Dicotyledonous Plants

SEQ ID NO: 567, SEQ ID NO: 943, SEQ ID NO: 945, SEQ ID NO: 947, SEQ ID NO: 1734, SEQ ID NO: 1874, SEQ ID NO: 1014, SEQ ID NO: 1970, SEQ ID NO: 1972, SEQ ID NO: 1944, SEQ ID NO: 1946, SEQ ID NO: 1948, SEQ ID NO: 1950, SEQ ID NO: 1952, SEQ ID NO: 1954, SEQ ID NO: 1956, SEQ ID NO: 1958, SEQ ID NO: 1960, SEQ ID NO: 1962, SEQ ID NO: 1964, SEQ ID NO: 1966, SEQ ID NO: 1968, SEQ ID NO: 1970, SEQ ID NO: 172, and paralogous, orthologous, and homologous sequences, recombined into pMEN20 or pMEN65 expression vectors, are transformed into a plant for the purpose of modifying plant traits. The cloning vector may be introduced into a variety of dicotyledonous plants by means well known in the art such as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated transformation. It is now routine to produce transgenic plants using most dicot plants (see Weissbach and Weissbach (1989) supra; Gelvin et al. (1990) supra; Herrera-Estrella et al. (1983) supra; Bevan (1984) supra; and Klee (1985) supra). Methods for analysis of traits are routine in the art and examples are disclosed above.

All references, publications, patent documents, web pages, and other documents cited or mentioned herein are hereby incorporated by reference in their entirety for all purposes. Although the invention has been described with reference to specific embodiments and examples, it should be understood that one of ordinary skill can make various modifications without departing from the spirit of the invention. The scope of the invention is not limited to the specific embodiments and examples provided.

TABLE 14

Polypeptide SEQ ID NO          1971         1973         1875         1735         568          944          946          948
Name                           SOY MADS1    SOY MADS3    FLC          MAF1         MAF2         MAF3         MAF4         MAF5
Gene ID                        Soy1         Soy3         G1759        G157         G859         G1842        G1843        G1844
1971       SOY MADS1  Soy1     100% (100%)
1973       SOY MADS3  Soy3     77% (83%)    100% (100%)
1875       FLC        G1759    37% (53%)    40% (58%)    100% (100%)
1735       MAF1       G157     33% (53%)    36% (56%)    65% (79%)    100% (100%)
568        MAF2       G859     34% (52%)    34% (53%)    61% (74%)    76% (85%)    100% (100%)
944        MAF3       G1842    32% (43%)    34% (56%)    60% (75%)    78% (87%)    87% (90%)    100% (100%)
946        MAF4       G1843    35% (53%)    36% (55%)    57% (74%)    65% (81%)    64% (77%)    64% (78%)    100% (100%)
948        MAF5       G1844    36% (54%)    36% (56%)    53% (73%)    63% (80%)    63% (79%)    65% (81%)    71% (83%)    100% (100%)

TABLE 15 Flowering time of the maf2 mutant (a)

                Days of cold    Photoperiod   Days to visible        Total leaf
Genotype    N   treatment (b)   (hours)       flower bud             number (c)

Experiment 1
maf2        20  3               8             76-94 (81.1 +/− 2.7)   ND
wild type   11  3               8             76-103 (89.8 +/− 5.7)  ND
maf2        24  3               12            28-42 (35.7 +/− 1.5)   7-34 (23.1 +/− 2.5)
wild type   23  3               12            20-46 (39.6 +/− 3.1)   6-37 (27.6 +/− 3.5)
maf2        24  3               24            17-20 (18.1 +/− 0.3)   10-13 (11.8 +/− 0.4)
wild type   24  3               24            16-23 (20.0 +/− 0.7)   9-16 (13.0 +/− 0.7)

Experiment 2
maf2        22  52              12            17-28 (19.8 +/− 1.2)   7-12 (8.5 +/− 0.4)
maf2        24  3               12            25-35 (28.7 +/− 0.9)   9-21 (16.0 +/− 1.7)
wild type   33  52              12            15-33 (23.0 +/− 1.7)   8-14 (10.2 +/− 0.5)
wild type   36  3               12            22-43 (34.7 +/− 1.6)   6-29 (21.5 +/− 1.7)

Experiment 3
maf2        24  10              12            ND                     8-15 (11.1 +/− 0.7)
wild type   24  10              12            ND                     16-22 (19.0 +/− 0.8)

Experiment 4
maf2        12  3               12            21-43 (34.8 +/− 4.1)   ND
wild type   12  3               12            34-48 (44.0 +/− 2.9)   ND
maf2        12  6               12            22-46 (33.3 +/− 5.9)   ND
wild type   12  6               12            22-48 (39.9 +/− 6.1)   ND
maf2        12  10              12            19-32 (26.5 +/− 3.0)   ND
wild type   12  10              12            26-48 (39.3 +/− 3.0)   ND
maf2        12  15              12            18-32 (24.2 +/− 2.4)   ND
wild type   12  15              12            34-43 (38.5 +/− 1.9)   ND
maf2        12  21              12            16-21 (18.0 +/− 1.1)   8-11 (8.5 +/− 0.6)
wild type   12  21              12            33-43 (36.9 +/− 1.7)   ND
maf2        12  85              12            15-20 (18.3 +/− 0.9)   7-9 (7.9 +/− 0.6)
wild type   11  85              12            14-19 (16.4 +/− 1.3)   7-9 (7.5 +/− 0.6)

Notes:
(a) Range of values obtained followed by mean +/− Standard Error with 95% confidence limits attached (parentheses)
(b) Duration of cold treatment on imbibed seeds at 4 degrees C., before transfer to growth room
(c) Number of leaf primordia produced by primary shoot meristem before first flower.
N = number of plants in population
wild type = non-transformed Columbia plants
ND = Not determined

TABLE 16 Flowering time of 35S:MAF2-5 Stockholm and Columbia T1 lines (a)

Genotype      Phenotype (b, c)     Penetrance  Days to visible flower bud  Total leaf number

Experiment 5 Columbia ecotype
Control (d)   wild-type            43/43       17-25 (21.1 +/− 0.5)        9-18 (14.3 +/− 0.7)
35S:MAF2      early                13/20       11-15 (12.8 +/− 0.7)        5-8 (6.2 +/− 0.6)
              slightly early       1/20        15-15 (15.0 +/− ND)         9-9 (9.0 +/− ND)
              wild-type            3/20        21-22 (21.3 +/− 1.4)        10-13 (11.3 +/− 3.8)
              slightly late        2/20        27-27 (27.0 +/− ND)         14-16 (15.0 +/− ND)
              late                 1/20        27-27 (27.0 +/− ND)         22-22 (22.0 +/− ND)
35S:MAF3      early                6/10        13-14 (13.5 +/− 0.6)        7-8 (7.7 +/− 0.5)
              slightly early       1/10        16-16 (16.0 +/− ND)         9-9 (9.0 +/− ND)
              wild-type            3/10        18-21 (19.3 +/− 3.8)        10-14 (12.0 +/− 5.0)
35S:MAF4 (e)  early                5/13        13-16 (15.2 +/− 1.6)        6-7 (6.2 +/− 0.6)
              slightly early (f)   3/13        18-27 (21.0 +/− 12.9)       7-8 (7.3 +/− 1.4)
              wild-type            2/13        23-25 (24.0 +/− ND)         10-15 (12.5 +/− ND)
              slightly late (f)    1/13        70-70 (70.0 +/− ND)         15-15 (15.0 +/− ND)
              arrested growth (g)  2/13        ND                          ND
35S:MAF5      early                5/10        14-15 (14.2 +/− 0.6)        6-7 (6.2 +/− 0.6)
              wild-type            5/10        17-25 (20.8 +/− 4.0)        9-16 (12.6 +/− 3.8)

Experiment 6 Columbia ecotype
Control (h)   wild-type            15/15       21-30 (25.3 +/− 1.3)        9-15 (11.5 +/− 1.1)
35S:MAF2      early                5/19        15-20 (18.0 +/− 2.6)        7-8 (7.4 +/− 0.7)
              slightly early       2/19        22-23 (22.5 +/− ND)         8-8 (8.0 +/− ND)
              wild-type            10/19       22-28 (23.5 +/− 1.3)        9-14 (11.1 +/− 1.0)
              late                 2/19        35-42 (38.5 +/− ND)         26-29 (27.5 +/− ND)
35S:MAF3      early                5/8         16-20 (17.6 +/− 2.9)        7-8 (7.4 +/− 0.7)
              slightly early       2/8         17-17 (17.0 +/− ND)         9-11 (10.0 +/− ND)
              wild-type            1/8         21-21 (21.0 +/− ND)         11-11 (11.0 +/− ND)
35S:MAF4 (e)  slightly early       2/7         20-27 (23.5 +/− ND)         6-10 (8.0 +/− ND)
              wild-type            5/7         24-30 (27.0 +/− 3.5)        10-11 (10.3 +/− 0.8)
35S:MAF5      early                1/19        20-20 (20.0 +/− ND)         8-8 (8.0 +/− ND)
              slightly early       2/19        20-20 (20.0 +/− ND)         11-13 (12.0 +/− ND)
              wild-type            15/19       21-27 (23.9 +/− 0.9)        9-15 (12.6 +/− 1.2)
              slightly late        1/19        27-27 (27.0 +/− ND)         18-18 (18.0 +/− ND)

Experiment 7 Stockholm ecotype
Control (i)   wild-type            23/23       32-42 (37.6 +/− 1.0)        30-42 (36.1 +/− 1.4)
35S:MAF2      early                1/13        31-31 (31.0 +/− ND)         17-17 (17.0 +/− ND)
              slightly early       5/13        32-34 (32.8 +/− 1.0)        17-27 (21.4 +/− 5.4)
              wild-type            6/13        36-40 (37.5 +/− 1.6)        30-38 (33.3 +/− 2.8)
              late                 1/13        55-55 (55 +/− ND)           44-44 (44 +/− ND)
35S:MAF3      early                5/20        21-31 (26.8 +/− 5.2)        12-25 (17.6 +/− 7.4)
              slightly early       5/20        32-36 (33.8 +/− 2.0)        23-29 (26.6 +/− 3.1)
              wild-type            7/20        34-42 (38.9 +/− 2.7)        30-41 (36.6 +/− 3.5)
              slightly late        3/20        39-43 (41.7 +/− 5.7)        31-43 (36.0 +/− 15.5)
35S:MAF4 (e)  slightly early       1/12        41-41 (41.0 +/− ND)         29-29 (29.0 +/− ND)
              wild-type            2/12        38-41 (39.5 +/− ND)         33-33 (33.0 +/− ND)
              slightly late        7/12        42-45 (43.9 +/− 1.1)        35-43 (36.8 +/− 5.1)
              arrested growth (g)  2/12        ND                          ND
35S:MAF5      wild-type            16/18       35-41 (37.9 +/− 0.9)        32-41 (36.8 +/− 1.7)
              slightly late        2/18        38-42 (40.0 +/− ND)         43-44 (43.5 +/− ND)

Notes:
Except where otherwise indicated, transgenic plants were selected on MS agar plates containing kanamycin at 50 mg/l
(a) Range of values obtained followed by mean +/− Standard Error with 95% confidence limits attached (parentheses) are shown for each class.
(b) Plants classified as early or late flowered outside the wild type range in terms of both days to first open flower and total leaf number.
(c) Plants classified as slightly early or slightly late flowered outside wild type range in terms of either days to first open flower or total leaf number.
(d) Control is wild type Columbia.
(e) The majority of 35S:MAF4 lines were small and stunted, and some showed
(f) a slow rate of growth
(g) These plants senesced and died at the seedling stage without producing flower buds.
(h) Controls for experiment 6 were Columbia T2 transformants from a mix of lines, containing the ‘empty’ transformation vector, and selected on kanamycin plates.
(i) Controls for experiment 7 were Stockholm T1 (primary) transformants containing the ‘empty’ transformation vector.
Penetrance = number of plants out of population showing phenotype
Plants grown under 24 hours light at 20-25 degrees Celsius in all three experiments.
ND = not determined
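Notes (b) and (c) of Table 16 define the phenotype classes by comparison with the wild-type range. The rule can be written out as a small Python sketch. It is an interpretation, not the scoring procedure actually used: the function names are hypothetical, the wild-type ranges are those of the Experiment 5 control row, the plant values are illustrative, and the "conflicting" label for a plant that is early on one measure and late on the other is an assumption, since the notes do not cover that case.

```python
from typing import Tuple

Range = Tuple[float, float]


def position(value: float, low: float, high: float) -> int:
    """Return -1 below the wild-type range, +1 above it, 0 within it."""
    if value < low:
        return -1
    if value > high:
        return 1
    return 0


def classify(days: float, leaves: float,
             wt_days: Range, wt_leaves: Range) -> str:
    """Classify one plant from its flowering time and total leaf number."""
    d = position(days, *wt_days)
    n = position(leaves, *wt_leaves)
    if d == 0 and n == 0:
        return "wild-type"
    if d <= 0 and n <= 0:
        # Outside the range on both measures: early; on only one: slightly early.
        return "early" if (d < 0 and n < 0) else "slightly early"
    if d >= 0 and n >= 0:
        return "late" if (d > 0 and n > 0) else "slightly late"
    return "conflicting"  # assumption: not defined in the table notes


# Wild-type ranges from the Experiment 5 control row of Table 16.
WT_DAYS = (17, 25)
WT_LEAVES = (9, 18)

if __name__ == "__main__":
    for days, leaves in [(12, 6), (15, 9), (21, 11), (27, 15), (27, 22)]:
        print(days, leaves, classify(days, leaves, WT_DAYS, WT_LEAVES))
```

Requiring both measures to fall outside the wild-type range for the full "early" or "late" call makes those classes more conservative than the "slightly" classes, which need only one measure outside the range.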

TABLE 17 Flowering time phenotypes of 35S:MAF2-5 T2 Columbia and Landsberg lines (a)

                                                              Days of cold   Photoperiod  Days to visible         Total leaf
Genotype          Line   N    T1 phenotype    T2 phenotype    treatment (b)  (hours)      flower bud              number

Experiment 8 (Columbia)
35S:MAF2          T2-16  13   late            late            3              24           25-39 (32.5 +/− 2.1)    17-35 (25.9 +/− 3.5)
35S:MAF2          T2-24  16   late            late            3              24           32-38 (34.9 +/− 1.2)    25-34 (29.6 +/− 1.4)
35S:MAF2          T2-32  20   early           wild-type       3              24           21-23 (21.5 +/− 0.4)    10-18 (14.4 +/− 0.9)
35S:MAF2          T2-36  15   early           wild-type       3              24           18-21 (20.8 +/− 0.4)    9-14 (12.1 +/− 0.9)
35S:MAF3          T2-2   17   early           early           3              24           16-20 (17.4 +/− 0.7)    8-14 (9.8 +/− 0.8)
35S:MAF3          T2-9   16   early           early           3              24           17-21 (19.1 +/− 0.7)    8-15 (11.5 +/− 1.0)
35S:MAF4 (c)      T2-4   11   slightly early  ND              3              24           26-32 (29.0 +/− 1.4)    11-16 (13.4 +/− 1.2)
35S:MAF4 (d)      T2-11  9    slightly early  wild-type       3              24           17-25 (22.1 +/− 1.7)    7-15 (10.7 +/− 2.5)
35S:MAF5          T2-10  19   early           wild-type       3              24           20-23 (21.6 +/− 0.5)    13-17 (14.7 +/− 0.6)
35S:MAF5          T2-25  16   early           wild-type       3              24           21-25 (22.0 +/− 0.7)    11-18 (13.6 +/− 0.9)
35S:MAF5          T2-7   14   early           wild-type       3              24           21-21 (21.0 +/− 0)      11-16 (13.4 +/− 1.0)
Control (e)       NA     47   wild-type       wild-type       3              24           18-23 (21.0 +/− 0.3)    10-18 (13.8 +/− 0.6)
35S:MAF2          T2-16  10   late            late            3              12           59-68 (63.8 +/− 3.0)    38-49 (44.0 +/− 2.4)
35S:MAF2          T2-24  17   late            late            3              12           52-66 (59.4 +/− 2.0)    31-54 (46.2 +/− 3.3)
35S:MAF2          T2-32  17   early           slightly early  3              12           28-40 (34.3 +/− 1.9)    11-24 (17.0 +/− 1.9)
35S:MAF2          T2-36  14   early           slightly early  3              12           22-46 (33.3 +/− 3.8)    7-23 (14.3 +/− 3.1)
35S:MAF3          T2-2   14   early           early           3              12           21-39 (27.1 +/− 2.7)    7-18 (10.7 +/− 1.8)
35S:MAF3          T2-9   18   early           early           3              12           21-42 (31.1 +/− 3.2)    7-24 (13.7 +/− 2.1)
35S:MAF3          T2-10  19   early           slightly early  3              12           28-42 (35.5 +/− 1.9)    10-29 (17.0 +/− 2.1)
35S:MAF4 (c)      T2-4   20   slightly early  ND              3              12           ND                      ND
35S:MAF4 (d)      T2-11  8    slightly early  ND              3              12           22-49 (30.3 +/− 10.5)   7-30 (21.0 +/− 7.3)
35S:MAF5          T2-10  17   early           wild-type       3              12           27-45 (39.2 +/− 2.4)    16-38 (24.3 +/− 2.9)
35S:MAF5          T2-25  13   early           wild-type       3              12           23-45 (35.8 +/− 5.0)    9-28 (20.3 +/− 3.6)
35S:MAF5          T2-7   7    early           wild-type       3              12           30-40 (34.1 +/− 4.2)    11-27 (16.9 +/− 5.4)
Control (e)       NA     35   wild-type       wild-type       3              12           21-49 (38.5 +/− 2.3)    8-32 (21.5 +/− 2.2)

Experiment 9 (Columbia)
35S:MAF2 T2       16     16   late            late            56             24           29-37 (32.8 +/− 1.2)    21-31 (25.8 +/− 1.3)
35S:MAF2 T2       16     16   late            late            3              24           29-38 (32.4 +/− 1.3)    24-33 (28.9 +/− 1.3)
control (e)       NA     16   NA              NA              56             24           14-17 (15.1 +/− 0.6)    7-11 (8.9 +/− 0.5)
control (e)       NA     16   NA              NA              3              24           21-22 (21.1 +/− 0.2)    13-18 (15.1 +/− 1.0)
control (f)       NA     16   NA              NA              56             24           14-18 (14.9 +/− 0.9)    7-10 (8.3 +/− 0.6)
control (f)       NA     16   NA              NA              3              24           20-23 (21.2 +/− 0.7)    13-21 (16.3 +/− 1.7)

Experiment 10 (Columbia)
35S:MAF2 T2 (g)   16     10   late            late            76             12           ND                      35-48 (40.0 +/− 3.3)
35S:MAF2 T2 (g)   16     10   late            late            3              12           ND                      38-49 (44.0 +/− 2.4)
control (f)       NA     10   NA              NA              76             12           16-21 (18.5 +/− 1.8)    6-8 (6.9 +/− 0.6)
control (f)       NA     24   NA              NA              3              12           24-38 (30.2 +/− 1.9)    10-22 (16.2 +/− 1.4)

Experiment 11 (Landsberg)
35S:MAF2 T2       207    16   ND              late            5              24           20-24 (21.6 +/− 0.7)    ND
35S:MAF2 T2       210    16   ND              late            5              24           27-36 (29.9 +/− 1.2)    ND
35S:MAF3 T2       102    16   ND              slightly late   5              24           17-24 (19.7 +/− 1.2)    ND
35S:MAF3 T2       105    16   ND              wild-type       5              24           17-21 (18.4 +/− 0.7)    ND
35S:MAF3 T2       115    16   ND              late            5              24           20-25 (22.6 +/− 1.0)    ND
35S:MAF3 T2       116    16   ND              late            5              24           20-22 (21.1 +/− 0.4)    ND
35S:MAF4 T2 (d)   3      8    ND              late            5              24           22-42 (30.1 +/− 4.8)    ND
35S:MAF4 T2 (d)   101    18   ND              late            5              24           20-28 (23.3 +/− 0.9)    ND
35S:MAF4 T2 (d)   202    14   ND              slightly late   5              24           17-26 (19.8 +/− 1.6)    ND
35S:MAF4 T2 (d)   204    12   ND              late            5              24           23-28 (25.2 +/− 1.3)    ND
35S:MAF5 T2       101    5    ND              slightly late   5              24           18-22 (20.2 +/− 1.8)    ND
35S:MAF5 T2       103    16   ND              slightly late   5              24           18-20 (19.3 +/− 0.5)    ND
35S:MAF5 T2       203    16   ND              slightly late   5              24           17-20 (19.6 +/− 0.5)    ND
35S:MAF5 T2       204    14   ND              slightly late   5              24           20-21 (20.4 +/− 0.3)    ND
control (f)       NA     103  NA              wild-type       5              24           16-20 (18.4 +/− 0.7)    ND
35S:MAF2 T2       207    17   ND              late            5              12           21-49 (37.9 +/− 4.8)    ND
35S:MAF2 T2       210    13   ND              late            5              12           34-80 (72.6 +/− 7.7)    ND
35S:MAF3 T2       102    14   ND              slightly late   5              12           21-30 (24.2 +/− 1.6)    ND
35S:MAF3 T2       105    16   ND              slightly late   5              12           20-37 (25.1 +/− 2.4)    ND
35S:MAF3 T2       115    16   ND              late            5              12           56-80 (66.5 +/− 4.2)    ND
35S:MAF3 T2       116    14   ND              late            5              12           49-65 (57.2 +/− 4.1)    ND
35S:MAF4 T2 (d)   3      9    ND              late            5              12           49-70 (59.4 +/− 5.7)    ND
35S:MAF4 T2 (d)   101    11   ND              late            5              12           42-69 (56.6 +/− 5.4)    ND
35S:MAF4 T2 (d)   202    8    ND              late            5              12           24-41 (33.4 +/− 5.2)    ND
35S:MAF4 T2 (d)   204    14   ND              late            5              12           26-60 (47.2 +/− 6.2)    ND
35S:MAF5 T2       101    9    ND              slightly late   5              12           21-35 (24.2 +/− 3.7)    ND
35S:MAF5 T2       103    16   ND              slightly late   5              12           18-29 (24.4 +/− 1.9)    ND
35S:MAF5 T2       203    18   ND              late            5              12           21-31 (25.5 +/− 1.7)    ND
35S:MAF5 T2       204    10   ND              late            5              12           29-41 (36.6 +/− 3.3)    ND
control (f)       NA     93   NA              wild-type       5              12           18-34 (22.9 +/− 0.6)    ND

Notes:
Except where otherwise indicated, transgenic plants were selected on MS agar plates containing kanamycin at 50 mg/l
(a) Range of values obtained followed by mean +/− Standard Error with 95% confidence limits attached (brackets)
(b) Duration of cold treatment on imbibed seeds at 4 degrees C., before transfer to growth room
(c) Plants were small, produced leaves at a slower rate than wild-type, and showed premature leaf senescence.
(d) Plants were distinctly small compared to wild-type.
(e) Controls were Columbia T2 transformants from a mix of lines, containing the ‘empty’ transformation vector, selected on kanamycin plates.
(f) Wild type, not selected on kanamycin
(g) Not selected on kanamycin
NA = Not applicable
ND = Not determined
N = number of plants in population

1. A transgenic plant comprising a recombinant polynucleotide having apolynucleotide sequence, or a complementary polynucleotide sequencethereof, selected from the group consisting of: (a) a nucleotidesequence encoding a polypeptide that initiates transcription, whereinsaid polypeptide is selected from the group consisting of SEQ ID NO: 2N,wherein N=1-480, SEQ ID NO: 2N-1, where N=857-970, or SEQ ID NO: 989,990, 991, 1001, 1002, 1012, 1018, 1021, 1022, 1025, 1027, 1028, 1029,1034,1050, 1051, 1072, 1073, 1074, 1075, 1076, 1091, 1092, 1093, 1094,1095, 1109, 1110, 1111, 1112, 1150, 1165, 1166, 1167, 1168, 1169, 1189,1190, 1191, 1197, 1198, 1199, 1213, 1214, 1215, 1216, 1226, 1227, 1233,1239, 1246, 1247, 1258, 1259, 1269, 1307, 1308, 1309, 1310, 1323, 1329,1330, 1331, 1332, 1338, 1339, 1340, 1361, 1362, 1373, 1374, 1375, 1384,1389, 1390, 1391, 1396, 1411, 1412, 1413, 1414, 1424, 1435, 1436, 1437,1448, 1456, 1457, 1458, 1459, 1460, 1472, 1483, 1484, 1500, 1508, 1510,1511, 1520, 1538, 1539, 1540, 1541, 1542, 1543, 1563, 1564, 1565, 1566,1567, 1568, 1569, 1582, 1583, 1594, 1611, 1612, 1618, 1619, 1620, 1626,1627, 1635, 1636, 1640, 1641, 1655, 1656, 1657, 1658, 1672, 1673, 1680,1682, 1686, 1687, 1688, 1689, 1696, 1702, 1703, 1945, 1947, 1949, 1951,1953, 1955, 1957, 1959, 1961, 1963, 1965, 1967, 1969, 1971, and 1973;(b) a nucleotide sequence encoding a polypeptide that initiatestranscription, wherein said nucleotide sequence is selected from thegroup consisting of SEQ ID NO: 2N-1, where N=1-480, SEQ ID NO: 2N, whereN=856-969, or SEQ ID NO: 961, 962, 963, 964, 965, 967, 968, 969, 970,971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984,985, 986, 987, 988, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1003,1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1013, 1014, 1015, 1016,1017, 1019, 1020, 1023, 1024, 1026, 1030, 1031, 1032, 1033, 1035,1036,1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048,1049, 1052, 1053, 1054, 1055, 1056, 1057, 
1058, 1059, 1060, 1061, 1062,1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1077, 1078, 1079,1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1096,1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108,1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124,1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136,1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148,1149, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161,1162, 1163, 1164, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178,1179, 1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1192, 1193,1194, 1195, 1196, 1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208,1209, 1210, 1211, 1212, 1217, 1218, 1219, 1220, 1221, 1222, 1223, 1224,1225, 1228, 1229, 1230, 1231, 1232, 1234, 1235, 1236, 1237, 1238, 1240,1241, 1242, 1243, 1244, 1245, 1248, 1249, 1250, 1251, 1252, 1253, 1254,1255, 1256, 1257, 1260, 1261, 1262, 1263, 1264, 1265, 1266, 1267, 1268,1270, 1271, 1272, 1273, 1274, 1275, 1276, 1277, 1278, 1279, 1280, 1281,1282, 1283, 1284, 1285, 1286, 1287, 1288, 1289, 1290, 1291, 1292, 1293,1294, 1295, 1296, 1297, 1298, 1299, 1300, 1301, 1302, 1303, 1304, 1305,1306, 1311, 1312, 1313, 1314, 1315, 1316, 1317, 1318, 1319, 1320, 1321,1322, 1324, 1325, 1326, 1327, 1328, 1333, 1334, 1335, 1336, 1337, 1341,1342, 1343, 1344, 1345, 1346, 1347, 1348, 1349, 1350, 1351, 1352, 1353,1354, 1355, 1356, 1357, 1358, 1359, 1360, 1363, 1364, 1365, 1366, 1367,1368, 1369, 1370, 1371, 1372, 1376, 1377, 1378, 1379, 1380, 1381, 1382,1383, 1385, 1386, 1387, 1388, 1392, 1393, 1394, 1395, 1397, 1398, 1399,1400, 1401, 1402, 1403, 1404, 1405, 1406, 1407, 1408, 1409, 1410, 1415,1416, 1417, 1418, 1419, 1420, 1421, 1422, 1423, 1425, 1426, 1427, 1428,1429, 1430, 1431, 1432, 1433, 1434, 1438, 1439, 1440, 1441, 1442, 1443,1444, 1445, 1446, 1447, 1449, 1450, 1451, 1452, 1453, 1454, 1455, 1461,1462, 1463, 1464, 1465, 1466, 1467, 1468, 1469, 1470, 
1471, 1473, 1474,1475, 1476, 1477, 1478, 1479, 1480, 1481, 1482, 1485, 1486, 1487, 1488,1489, 1490, 1491, 1492, 1493, 1494, 1495, 1496, 1497, 1498, 1499, 1501,1502, 1503, 1504, 1505, 1506, 1507, 1509, 1512, 1513, 1514, 1515, 1516,1517, 1518, 1519, 1521, 1522, 1523, 1524, 1525, 1526, 1527, 1528, 1529,1530, 1531, 1532, 1533, 1534, 1535, 1536, 1537, 1544, 1545, 1546, 1547,1548, 1549, 1550, 1551, 1552, 1553, 1554, 1555, 1556, 1557, 1558, 1559,1560, 1561, 1562, 1570, 1571, 1572, 1573, 1574, 1575, 1576, 1577, 1578,1579, 1580, 1581, 1584, 1585, 1586, 1587, 1588, 1589, 1590, 1591, 1592,1593, 1595, 1596, 1597, 1598, 1599, 1600, 1601, 1602,1603, 1604, 1605,1606, 1607, 1608, 1609, 1610, 1613, 1614, 1615, 1616, 1617, 1621, 1622,1623, 1624, 1625, 1628, 1629, 1630, 1631, 1632, 1633, 1634, 1637, 1638,1639, 1642, 1643, 1644, 1645, 1646, 1647, 1648, 1649, 1650, 1651, 1652,1653, 1654, 1659, 1660, 1661, 1662, 1663, 1664, 1665, 1666, 1667, 1668,1669, 1670, 1671, 1674, 1675, 1676, 1677, 1678, 1679, 1681, 1683, 1684,1685, 1690, 1691, 1692, 1693, 1694, 1695, 1697, 1698, 1699, 1700, 1701,1704, 1705, 1706, 1707, 1708, 1709, 1710, 1711, 1944, 1946, 1948, 1950,1952, 1954, 1956, 1958, 1960, 1962, 1964, 1966, 1968, 1970, and 1972;(c) a nucleotide sequence encoding the polypeptide sequence of (a) withconservative substitutions as defined in Table 2; (d) a variant of thenucleotide sequences of (a) or (b), which is at least 80% identical to asequence of (a) or (b); (e) an orthologous sequence of the nucleotidesequences of (a) or (b), which is at least 80% identical to a sequenceof (a) or (b); (f) a paralogous sequence of the nucleotide sequences of(a) or (b), which is at least 80% identical to a sequence of (a) or (b);(g) a nucleotide sequence encoding a polypeptide comprising a conserveddomain that exhibits at least 80% sequence homology with the conserveddomain of the polypeptide of (a); and wherein said conserved domain of(a) is bounded by amino acid residue coordinates according to Tables 
5a and 5b, and (h) a nucleic acid sequence that hybridizes to the polynucleotide of (a) or (b) under stringent conditions.
2. The transgenic plant of claim 1, wherein said stringent conditions are 6×SSC and 65° C.
3. The transgenic plant of claim 2, wherein: the transgenic plant possesses an altered trait as compared to a non-transformed plant; or the transgenic plant exhibits an altered phenotype as compared to said non-transformed plant; or the transgenic plant expresses an altered level of one or more genes associated with a plant trait as compared to said non-transformed plant; wherein said non-transformed plant does not overexpress the recombinant polynucleotide.
4. The transgenic plant of claim 3, wherein said polynucleotide sequence is derived from a monocotyledonous plant.
5. The transgenic plant of claim 4, wherein said transgenic plant is dicotyledonous.
6. The transgenic plant of claim 3, wherein said polynucleotide sequence is derived from a dicotyledonous plant.
7. The transgenic plant of claim 6, wherein said transgenic plant is monocotyledonous.
8. The transgenic plant of claim 3, wherein the plant is selected from the group consisting of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, clover, sugarcane, turf, banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, tobacco, tomato, watermelon, mint and other labiates, rosaceous fruits, and vegetable brassicas.
9. The transgenic plant of claim 3, further comprising a constitutive, inducible, or tissue-specific promoter operably linked to said polynucleotide sequence or said complementary polynucleotide sequence.
10. The transgenic plant of claim 3, wherein the encoded polypeptide of (a)-(h) is expressed and regulates transcription of at least one gene.
11. The transgenic plant of claim 10, wherein the encoded polypeptide of (a)-(h) is SEQ ID NO: 468.
12. The transgenic plant of claim 3, wherein said altered trait is an enhanced tolerance to abiotic stress.
13. The transgenic plant of claim 12, wherein said abiotic stress is increased tolerance to chilling and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 183, 437, 805, and polynucleotide variants thereof.
14. The transgenic plant of claim 12, wherein said abiotic stress is increased germination in cold conditions and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 137, 183, 437, 805, 841, and polynucleotide variants thereof.
15. The transgenic plant of claim 12, wherein said abiotic stress is increased freezing tolerance and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 615 and polynucleotide variants thereof.
16. The transgenic plant of claim 12, wherein said abiotic stress is increased tolerance to heat and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 291, 467, 573, 635, 819, and polynucleotide variants thereof.
17. The transgenic plant of claim 12, wherein said abiotic stress is increased tolerance to drought and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 53, 615, 941, and polynucleotide variants thereof.
18. The transgenic plant of claim 12, wherein said abiotic stress is increased tolerance to osmotic stress and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 97, 223, 309, 721, 731, 941, and polynucleotide variants thereof.
19. The transgenic plant of claim 12, wherein said abiotic stress is increased tolerance to salt and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 27, 105, 141, 549, 579, 591, and polynucleotide variants thereof.
20. The transgenic plant of claim 12, wherein said abiotic stress is decreased sensitivity to nitrogen limitation and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 139, 141, 467, 805, 939, 959, and polynucleotide variants thereof.
21. The transgenic plant of claim 12, wherein said abiotic stress is increased tolerance to phosphate limitation and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 345 and polynucleotide variants thereof.
22. The transgenic plant of claim 12, wherein said abiotic stress is increased tolerance to potassium limitation and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 269, 359, 613, and polynucleotide variants thereof.
23. The transgenic plant of claim 3, wherein said altered trait is altered hormone sensitivity.
24. The transgenic plant of claim 23, wherein said altered hormone sensitivity is reduced sensitivity to abscisic acid and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 721, 941, and polynucleotide variants thereof.
25. The transgenic plant of claim 23, wherein said altered hormone sensitivity is an altered response to ethylene and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 507, 713, 741, and polynucleotide variants thereof.
26. The transgenic plant of claim 3, wherein said altered trait is disease resistance.
27. The transgenic plant of claim 26, wherein said altered trait is altered susceptibility to Botrytis and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 37, 117, 171, 191, 575, 587, 793, and polynucleotide variants thereof.
28. The transgenic plant of claim 26, wherein said altered trait is altered susceptibility to Fusarium and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 97, 345, 595, and polynucleotide variants thereof.
29. The transgenic plant of claim 26, wherein said altered trait is altered susceptibility to Erysiphe and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 21, 37, 159, 241, 261, 345, 389, 401, 575, 581, 587, 695, 799, and polynucleotide variants thereof.
30. The transgenic plant of claim 26, wherein said altered trait is altered susceptibility to Pseudomonas syringae and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 267, 337, 345, and polynucleotide variants thereof.
31. The transgenic plant of claim 26, wherein said altered trait is altered susceptibility to Sclerotinia and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 37, 303, 539, 553, and polynucleotide variants thereof.
 32. The transgenic plantof claim 3, wherein said altered trait is altered sugar sensing and saidnucleotide sequence is selected from the group consisting of SEQ ID NO:33, 41, 49, 117, 137, 163, 179, 193, 217, 343, 369, 463, 529, 531, 579,605, 615, 659, 719, 831, 861, 911, 939, and polynucleotide variantsthereof.
 33. The transgenic plant of claim 32, wherein said alteredsugar sensing is increased tolerance to sugars and said nucleotidesequence is selected from the group consisting of SEQ ID NO: 137, 529,531, 579, 605, 911, 939, and polynucleotide variants thereof.
 34. Thetransgenic plant of claim 32, wherein said altered sugar sensing confersimproved seed germination.
 35. The transgenic plant of claim 32, whereinsaid altered sugar sensing confers improved seedling vigor.
 36. Thetransgenic plant of claim 3, wherein said altered trait is earlyflowering and said nucleotide sequence is selected from the groupconsisting of SEQ ID NO: 57, 59, 61, 69, 87, 89, 91, 119, 143, 181, 249,301, 387, 391,405, 539, 575, 685, 787, 819, 867, 921, 937,941, 943, 945,947, 951, 957, 1952, 1954, 1956, 1958, 1960, 1962, 1964, 1966, 1968,1970, 1972, and polynucleotide variants thereof.
 37. The transgenicplant of claim 3, wherein said altered trait is late flowering and saidnucleotide sequence is selected from the group consisting of SEQ ID NO:11, 69, 101, 109, 127, 157, 173, 237, 275, 307, 361, 375, 389, 463, 487,489, 497, 503, 567, 585, 603, 615; 639, 657, 699, 723, 745, 859, 887,893, 901, 911, 1944, 1946, 1948, 1950, 1970, 1972, and polynucleotidevariants thereof.
 38. The transgenic plant of claim 3, wherein said altered trait is an extended period of flowering and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 949 and polynucleotide variants thereof.
 39. The transgenic plant of claim 3, wherein said altered trait is altered flower structure and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 55, 95, 295, 357, 381, 399, 419, 483, 449, 525, 581, 647, 717, 725, 741, 799, 827, 849, 883, 891, 913, 925, 929, 949, and polynucleotide variants thereof.
 40. The transgenic plant of claim 3, wherein said altered trait is an inflorescence architectural change and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 59, 245, 381, 575, 805, 883, 901, 913, and polynucleotide variants thereof.
 41. The transgenic plant of claim 3, wherein said altered trait is a change in stem bifurcations and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 249 and polynucleotide variants thereof.
 42. The transgenic plant of claim 3, wherein said altered trait is a lack of a shoot meristem and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 899 and polynucleotide variants thereof.
 43. The transgenic plant of claim 3, wherein said altered trait is reduced meristem cell differentiation and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 919 and polynucleotide variants thereof.
 44. The transgenic plant of claim 3, wherein said altered trait is altered phyllotaxy and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 419 and polynucleotide variants thereof.
 45. The transgenic plant of claim 3, wherein said altered trait is an altered branching pattern and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 283, 913, and polynucleotide variants thereof.
 46. The transgenic plant of claim 3, wherein said altered trait is reduced apical dominance and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 357, 483, 793, 807, 929, and polynucleotide variants thereof.
 47. The transgenic plant of claim 3, wherein said altered trait is reduced trichome density or lack of trichomes and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 31, 123, 139, 141, 457, 467, 855, 939, 959, and polynucleotide variants thereof.
 48. The transgenic plant of claim 3, wherein said altered trait is ectopic trichome development or altered trichome development and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 169 and polynucleotide variants thereof.
 49. The transgenic plant of claim 3, wherein said altered trait is an increase in trichome number and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 415 and polynucleotide variants thereof.
 50. The transgenic plant of claim 3, wherein said altered trait is altered stem morphology and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 283, 497, and polynucleotide variants thereof.
 51. The transgenic plant of claim 3, wherein said altered trait is increased root growth and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 13, 905, and polynucleotide variants thereof.
 52. The transgenic plant of claim 3, wherein said altered trait is increased root hairs and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 139, 141, 467, 939, 959, and polynucleotide variants thereof.
 53. The transgenic plant of claim 3, wherein said altered trait is altered seed development and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 649 and polynucleotide variants thereof.
 54. The transgenic plant of claim 3, wherein said altered trait is altered cell proliferation or cell differentiation and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 919 and polynucleotide variants thereof.
 55. The transgenic plant of claim 3, wherein said altered trait is slow growth and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 285, 491, 713, 859, 901, and polynucleotide variants thereof.
 56. The transgenic plant of claim 3, wherein said altered trait is premature senescence and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 417, 737, and polynucleotide variants thereof.
 57. The transgenic plant of claim 3, wherein said altered trait is delayed senescence and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 173, 375, 585, 697, and polynucleotide variants thereof.
 58. The transgenic plant of claim 3, wherein said altered trait is lethality when overexpressed and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 239, 379, 583, 727, 817, and polynucleotide variants thereof.
 59. The transgenic plant of claim 3, wherein said altered trait is increased necrosis and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 29 and polynucleotide variants thereof.
 60. The transgenic plant of claim 3, wherein said altered trait is an increase in seedling or plant size and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 53, 723, 829, 887, 893, and polynucleotide variants thereof.
 61. The transgenic plant of claim 3, wherein said altered trait is decreased plant size and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 1, 5, 25, 29, 35, 89, 103, 185, 203, 245, 285, 417, 447, 449, 489, 507, 557, 573, 581, 591, 627, 647, 655, 657, 673, 677, 701, 717, 725, 753, 775, 799, 801, 807, 809, 825, 827, 831, 837, 841, 843, 849, 855, 857, 869, 901, 921, 925, and polynucleotide variants thereof.
 62. The transgenic plant of claim 3, wherein said altered trait is a change in leaf morphology.
 63. The transgenic plant of claim 62, wherein said change in leaf morphology is dark green leaves and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 245, 285, 615, 627, 647, 737, 801, 843, 851, 857, 913, and polynucleotide variants thereof.
 64. The transgenic plant of claim 62, wherein said change in leaf morphology is altered leaf shape and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 39, 137, 271, 291, 407, 449, 487, 603, 605, 619, 627, 647, 687, 717, 723, 725, 803, 911, 929, and polynucleotide variants thereof.
 65. The transgenic plant of claim 62, wherein said change in leaf morphology is increased altered leaf development, and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 929, and polynucleotide variants thereof.
 66. The transgenic plant of claim 62, wherein said change in leaf morphology is increased leaf size and mass, and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 283, 805, 893, and polynucleotide variants thereof.
 67. The transgenic plant of claim 62, wherein said change in leaf morphology is glossy leaves and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 643, 801, and polynucleotide variants thereof.
 68. The transgenic plant of claim 60, wherein said change in leaf morphology is leaf cell expansion and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 331 and polynucleotide variants thereof.
 69. The transgenic plant of claim 3, wherein said altered trait is a change in seed morphology.
 70. The transgenic plant of claim 69, wherein said change in seed morphology is altered seed coloration and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 67, 435, 443, and polynucleotide variants thereof.
 71. The transgenic plant of claim 69, wherein said change in seed morphology is increased seed size and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 115, 385, 793, and polynucleotide variants thereof.
 72. The transgenic plant of claim 69, wherein said change in seed morphology is decreased seed size and said nucleotide sequence is SEQ ID NO: 749 and polynucleotide variants thereof.
 73. The transgenic plant of claim 69, wherein said change in seed morphology is altered seed shape and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 713, 749, and polynucleotide variants thereof.
 74. The transgenic plant of claim 3, wherein said altered trait is a change in leaf biochemistry.
 75. The transgenic plant of claim 74, wherein said change in leaf biochemistry is increased leaf wax and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 643, 801, and polynucleotide variants thereof.
 76. The transgenic plant of claim 74, wherein said change in leaf biochemistry is an alteration in leaf prenyl lipid content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 127, 203, 653, 739, 845, 853, and polynucleotide variants thereof.
 77. The transgenic plant of claim 74, wherein said change in leaf biochemistry is increased leaf insoluble sugars and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 121, 159, 165, 199, 215, 271, 277, 337, 393, 521, 581, 825, and polynucleotide variants thereof.
 78. The transgenic plant of claim 74, wherein said change in leaf biochemistry is decreased leaf insoluble sugars and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 215, 271, 671, and polynucleotide variants thereof.
 79. The transgenic plant of claim 74, wherein said change in leaf biochemistry is increased leaf anthocyanins and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 435, and polynucleotide variants thereof.
 80. The transgenic plant of claim 74, wherein said change in leaf biochemistry is an alteration of leaf fatty acid content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 127, 151, 473, 799, 861, and polynucleotide variants thereof.
 81. The transgenic plant of claim 74, wherein said change in leaf biochemistry is an alteration of leaf glucosinolate content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 91, 465, 721, 763, 841, and polynucleotide variants thereof.
 82. The transgenic plant of claim 3, wherein said altered trait is a change in seed biochemistry.
 83. The transgenic plant of claim 82, wherein said change in seed biochemistry is an increase in seed oil content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 71, 147, 151, 209, 287, 291, 359, 387, 393, 483, 565, 633, 759, 763, and polynucleotide variants thereof.
 84. The transgenic plant of claim 82, wherein said change in seed biochemistry is a decrease in seed oil content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 87, 101, 111, 135, 163, 435, 443, 473, 483, 521, 613, 843, 941, and polynucleotide variants thereof.
 85. The transgenic plant of claim 82, wherein said change in seed biochemistry is an increase in seed fatty acid content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 523, 571, 581, 631, 637, 877, 881, and polynucleotide variants thereof.
 86. The transgenic plant of claim 82, wherein said change in seed biochemistry is a decrease in seed fatty acid content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 519, 541, 631, 653, 881, and polynucleotide variants thereof.
 87. The transgenic plant of claim 82, wherein said change in seed biochemistry is an increase in seed protein content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 111, 135, 141, 163, 407, 409, 435, 443, 473, 483, 575, 613, 695, 843, 891, 941, and polynucleotide variants thereof.
 88. The transgenic plant of claim 82, wherein said change in seed biochemistry is a decrease in seed protein content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 147, 151, 267, 287, 291, 483, 927, and polynucleotide variants thereof.
 89. The transgenic plant of claim 82, wherein said change in seed biochemistry is an alteration in seed prenyl lipid content and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 127, 473, 497, 589, 699, and polynucleotide variants thereof.
 90. The transgenic plant of claim 82, wherein said change in seed biochemistry is an increase in seed sterols and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 23 and polynucleotide variants thereof.
 91. The transgenic plant of claim 82, wherein said change in seed biochemistry is an upregulation of genes involved in secondary metabolism, and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 147 and polynucleotide variants thereof.
 92. The transgenic plant of claim 3, wherein said altered trait is an increase in root anthocyanins and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 435 and polynucleotide variants thereof.
 93. The transgenic plant of claim 3, wherein said altered trait is an increase in plant anthocyanins and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 435, 905, and polynucleotide variants thereof.
 94. The transgenic plant of claim 3, wherein said altered trait is an alteration in light response or shade avoidance, and said nucleotide sequence is selected from the group consisting of SEQ ID NO: 235, 713, 841, and polynucleotide variants thereof.
 95. A method of using a transgenic plant of claim 1 to grow a progeny plant, the method comprising: (a) crossing the transgenic plant with itself or another plant; (b) selecting seed that develops as a result of said crossing; and (c) growing the progeny plant from the seed.
 96. The method of claim 95, wherein: said progeny plant expresses mRNA that encodes a DNA-binding protein that binds to a DNA regulatory sequence and induces expression of a plant trait gene; said mRNA is expressed at a level greater than a non-transformed plant; and said progeny plant is characterized by a change in a plant trait compared to said non-transformed plant; wherein said non-transformed plant does not comprise the recombinant polynucleotide.
 97. An expression cassettecomprising: (1) a constitutive, inducible, or tissue-specific promoter;and (2) a recombinant polynucleotide having a polynucleotide sequence,or a complementary polynucleotide sequence thereof, selected from thegroup consisting of: ((a) a nucleotide sequence encoding a polypeptidethat initiates transcription, wherein said polypeptide is selected fromthe group consisting of SEQ ID NO: 2N, wherein N=1-480, SEQ ID NO: 2N-1,where N=857-970, or SEQ ID NO: 989, 990, 991, 1001, 1002, 1012, 1018,1021, 1022, 1025, 1027, 1028, 1029, 1034, 1050, 1051, 1072, 1073, 1074,1075, 1076, 1091, 1092, 1093, 1094, 1095, 1109, 1110, 1111, 1112, 1150,1165, 1166, 1167, 1168, 1169, 1189, 1190, 1191, 1197, 1198, 1199, 1213,1214, 1215, 1216, 1226, 1227, 1233, 1239, 1246, 1247, 1258, 1259, 1269,1307, 1308, 1309, 1310, 1323, 1329, 1330, 1331, 1332, 1338, 1339, 1340,1361, 1362, 1373, 1374, 1375, 1384, 1389, 1390, 1391, 1396, 1411, 1412,1413, 1414, 1424, 1435, 1436, 1437, 1448, 1456, 1457, 1458, 1459, 1460,1472, 1483, 1484, 1500, 1508, 1510, 1511, 1520, 1538, 1539, 1540, 1541,1542, 1543, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1582, 1583, 1594,1611, 1612, 1618, 1619, 1620, 1626, 1627, 1635, 1636, 1640, 1641, 1655,1656, 1657, 1658, 1672, 1673, 1680, 1682, 1686, 1687,1688, 1689, 1696,1702, 1703, 1945, 1947, 1949, 1951, 1953, 1955, 1957, 1959, 1961, 1963,1965, 1967, 1969, 1971, and 1973; (b) a nucleotide sequence encoding apolypeptide that initiates transcription, wherein said nucleotidesequence is selected from the group consisting of SEQ ID NO: 2N-1, whereN=1-480, SEQ ID NO: 2N, where N=856-969, or SEQ ID NO: 961, 962, 963,964, 965, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978,979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 992, 993, 994, 995,996, 997, 998, 999, 1000, 1003, 1004, 1005, 1006, 1007, 1008, 1009,1010, 1011, 1013, 1014, 1015, 1016, 1017, 1019, 1020, 1023, 1024, 1026,1030, 1031, 1032, 1033, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042,1043, 
1044, 1045, 1046, 1047, 1048, 1049, 1052, 1053, 1054, 1055, 1056,1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068,1069, 1070, 1071, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085,1086, 1087, 1088, 1089, 1090, 1096, 1097, 1098, 1099, 1100, 1101, 1102,1103, 1104, 1105, 1106, 1107, 1108, 1113, 1114, 1115, 1116, 1117, 1118,1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130,1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142,1143, 1144, 1145, 1146, 1147, 1148, 1149, 1151, 1152, 1153, 1154, 1155,1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1170, 1171, 1172,1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182, 1183, 1184,1185, 1186, 1187, 1188, 1192, 1193, 1194, 1195, 1196, 1200, 1201, 1202,1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1217, 1218,1219, 1220, 1221, 1222, 1223,1224, 1225, 1228, 1229, 1230, 1231, 1232,1234, 1235, 1236, 1237, 1238, 1240, 1241, 1242, 1243, 1244, 1245, 1248,1249, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1260, 1261, 1262,1263, 1264, 1265, 1266, 1267, 1268, 1270, 1271, 1272, 1273, 1274, 1275,1276, 1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284, 1285, 1286, 1287,1288, 1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299,1300, 1301, 1302, 1303, 1304, 1305, 1306, 1311, 1312, 1313, 1314, 1315,1316, 1317, 1318, 1319, 1320, 1321, 1322, 1324, 1325, 1326, 1327, 1328,1333, 1334, 1335, 1336, 1337, 1341, 1342, 1343, 1344, 1345, 1346, 1347,1348, 1349, 1350, 1351, 1352, 1353, 1354, 1355, 1356, 1357, 1358, 1359,1360, 1363, 1364, 1365, 1366, 1367, 1368, 1369, 1370, 1371, 1372, 1376,1377, 1378, 1379, 1380, 1381, 1382, 1383, 1385, 1386, 1387, 1388, 1392,1393, 1394, 1395, 1397, 1398, 1399, 1400, 1401, 1402, 1403, 1404, 1405,1406, 1407, 1408, 1409, 1410, 1415, 1416, 1417, 1418, 1419, 1420, 1421,1422, 1423, 1425, 1426, 1427, 1428, 1429, 1430, 1431, 1432, 1433, 1434,1438, 1439, 1440, 1441, 1442, 1443, 1444, 1445, 1446, 1447,1449, 1450,1451, 1452, 1453, 
1454, 1455, 1461, 1462, 1463, 1464, 1465, 1466, 1467,1468, 1469, 1470, 1471, 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480,1481, 1482, 1485, 1486, 1487, 1488, 1489, 1490, 1491, 1492, 1493, 1494,1495, 1496, 1497, 1498, 1499, 1501, 1502, 1503, 1504, 1505, 1506, 1507,1509, 1512, 1513, 1514, 1515, 1516, 1517, 1518, 1519, 1521, 1522, 1523,1524, 1525, 1526, 1527, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535,1536, 1537, 1544, 1545, 1546, 1547, 1548, 1549, 1550, 1551, 1552, 1553,1554, 1555, 1556, 1557, 1558, 1559, 1560, 1561, 1562, 1570, 1571, 1572,1573, 1574, 1575, 1576, 1577, 1578, 1579, 1580, 1581, 1584, 1585, 1586,1587, 1588, 1589, 1590, 1591, 1592, 1593, 1595, 1596, 1597, 1598, 1599,1600, 1601, 1602,1603, 1604, 1605, 1606, 1607, 1608, 1609, 1610,1613,1614, 1615, 1616, 1617, 1621, 1622, 1623, 1624, 1625, 1628, 1629,1630, 1631, 1632, 1633,1634, 1637, 1638, 1639, 1642, 1643, 1644,1645,1646, 1647, 1648, 1649, 1650, 1651, 1652, 1653, 1654, 1659, 1660, 1661,1662, 1663, 1664, 1665, 1666, 1667, 1668, 1669, 1670, 1671, 1674, 1675,1676, 1677, 1678, 1679, 1681, 1683, 1684, 1685, 1690, 1691, 1692, 1693,1694, 1695, 1697, 1698, 1699, 1700, 1701, 1704, 1705, 1706, 1707, 1708,1709, 1710, 1711, 1944, 1946, 1948, 1950, 1952, 1954, 1956, 1958, 1960,1962, 1964, 1966, 1968, 1970, and 1972; (c) a nucleotide sequenceencoding the polypeptide sequence of (a) with conservative substitutionsas defined in Table 2; (d) a variant of the nucleotide sequences of (a)or (b), which is at least 80% identical to a sequence of (a) or (b); (e)an orthologous sequence of the nucleotide sequences of (a) or (b), whichis at least 80% identical to a sequence of (a) or (b); (f) a paralogoussequence of the nucleotide sequences of (a) or (b), which is at least80% identical to a sequence of (a) or (b); (g) a nucleotide sequenceencoding a polypeptide comprising a conserved domain that exhibits atleast 80% sequence homology with the conserved domain of the polypeptideof (a); and wherein said conserved domain of (a) 
is bounded by aminoacid residue coordinates according to Tables 5a and 5b, and (h) anucleic acid sequence that hybridizes to the polynucleotide of (a) or(b) under stringent conditions; wherein said recombinant polynucleotideis operably linked to said promoter.
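The formula-based ranges in elements (a) and (b) of claim 97 can be expanded into explicit SEQ ID NO sets. The following is a minimal illustrative sketch, not part of the claims; the function names are hypothetical, and the sketch covers only the "2N" and "2N-1" ranges, not the individually enumerated SEQ ID NOs that follow them in each element.

```python
def polypeptide_formula_ids():
    """SEQ ID NOs named by the formula ranges of element (a) (polypeptides)."""
    even_ids = {2 * n for n in range(1, 481)}       # SEQ ID NO: 2N, N = 1-480
    odd_ids = {2 * n - 1 for n in range(857, 971)}  # SEQ ID NO: 2N-1, N = 857-970
    return even_ids | odd_ids


def nucleotide_formula_ids():
    """SEQ ID NOs named by the formula ranges of element (b) (nucleotides)."""
    odd_ids = {2 * n - 1 for n in range(1, 481)}    # SEQ ID NO: 2N-1, N = 1-480
    even_ids = {2 * n for n in range(856, 970)}     # SEQ ID NO: 2N, N = 856-969
    return odd_ids | even_ids


if __name__ == "__main__":
    poly = polypeptide_formula_ids()
    nuc = nucleotide_formula_ids()
    # Element (a) spans even numbers 2-960 and odd numbers 1713-1939.
    print(min(poly), max(poly), len(poly))
    # Element (b) spans odd numbers 1-959 and even numbers 1712-1938.
    print(min(nuc), max(nuc), len(nuc))
```

Under this reading, the two formula-based sets do not overlap, and SEQ ID NO: 468 falls within the polypeptide range of element (a).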
 98. The expression cassette of claim 97, wherein said stringent conditions are of 6×SSC and 65° C.
 99. The expression cassette of claim 98, wherein the encoded polypeptide of (a)-(h) is SEQ ID NO: 468.
 100. A host cell comprising the expression cassette of claim
 97. 101. A method for producing a modified planthaving a modified trait, the method comprising: (a) selecting apolynucleotide that encodes a polypeptide, wherein said polynucleotidehas a sequence, or a complementary sequence thereof, selected from thegroup consisting of: (i) a nucleotide sequence encoding a polypeptidethat initiates transcription, wherein said polypeptide is selected fromthe group consisting of SEQ ID NO: 2N, wherein N=1-480, SEQ ID NO: 2N-1,where N=857-970, or SEQ ID NO: 989, 990, 991, 1001, 1002, 1012, 1018,1021, 1022, 1025, 1027, 1028, 1029, 1034, 1050, 1051, 1072,1073, 1074,1075, 1076, 1091, 1092, 1093, 1094, 1095, 1109,1110, 1111, 1112, 1150,1165, 1166, 1167, 1168, 1169, 1189,1190, 1191, 1197, 1198,1199,1213,1214,1215,1216, 1226,1227,1233,1239, 1246, 1247, 1258, 1259, 1269,1307, 1308, 1309, 1310, 1323, 1329, 1330, 1331, 1332, 1338, 1339, 1340,1361, 1362, 1373, 1374, 1375, 1384, 1389, 1390, 1391, 1396, 1411, 1412,1413, 1414, 1424, 1435, 1436, 1437, 1448, 1456, 1457, 1458, 1459, 1460,1472, 1483, 1484, 1500, 1508, 1510, 1511, 1520, 1538, 1539, 1540, 1541,1542, 1543, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1582, 1583, 1594,1611, 1612, 1618, 1619, 1620, 1626, 1627, 1635, 1636,1640, 1641, 1655,1656, 1657, 1658, 1672, 1673, 1680, 1682, 1686, 1687, 1688, 1689, 1696,1702, 1703, 1945, 1947, 1949, 1951, 1953, 1955, 1957, 1959, 1961, 1963,1965, 1967, 1969, 1971, and 1973; (ii) a nucleotide sequence encoding apolypeptide that initiates transcription, wherein said nucleotidesequence is selected from the group consisting of SEQ ID NO: 2N-1, whereN=1-480, SEQ ID NO: 2N, where N=856-969, or SEQ ID NO: 961, 962, 963,964, 965, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978,979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 992, 993, 994, 995,996, 997, 998, 999, 1000, 1003, 1004, 1005, 1006, 1007, 1008, 1009,1010, 1011, 1013, 1014, 1015, 1016, 1017, 1019, 1020, 1023, 1024, 1026,1030, 1031, 1032, 1033, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 
1042,1043, 1044, 1045, 1046, 1047, 1048, 1049, 1052, 1053, 1054, 1055, 1056,1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068,1069, 1070, 1071, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085,1086, 1087, 1088, 1089, 1090, 1096, 1097, 1098, 1099, 1100, 1101, 1102,1103, 1104, 1105, 1106, 1107, 1108, 1113, 1114, 1115, 1116, 1117, 1118,1119, 1120, 1121, 1122, 1123, 1124, 1125,1126, 1127, 1128,1129,1130,1131, 1132,1133, 1134,1135,1136, 1137, 1138, 1139, 1140, 1141, 1142,1143, 1144, 1145, 1146, 1147, 1148, 1149, 1151, 1152, 1153, 1154, 1155,1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1170, 1171, 1172,1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182, 1183, 1184,1185, 1186, 1187, 1188, 1192, 1193, 1194, 1195, 1196, 1200, 1201, 1202,1203, 1204, 1205, 1206,1207,1208, 1209, 1210, 1211, 1212, 1217, 1218,1219, 1220, 1221, 1222, 1223, 1224, 1225, 1228, 1229, 1230, 1231, 1232,1234, 1235, 1236, 1237, 1238, 1240, 1241, 1242, 1243, 1244, 1245, 1248,1249, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1260, 1261, 1262,1263, 1264, 1265, 1266, 1267, 1268, 1270, 1271, 1272, 1273, 1274, 1275,1276, 1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284,1285, 1286, 1287,1288, 1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299,1300, 1301, 1302, 1303, 1304, 1305, 1306, 1311, 1312, 1313, 1314, 1315,1316, 1317, 1318, 1319, 1320, 1321, 1322, 1324, 1325, 1326, 1327, 1328,1333, 1334, 1335, 1336, 1337, 1341, 1342, 1343, 1344, 1345, 1346, 1347,1348, 1349, 1350, 1351, 1352, 1353, 1354, 1355, 1356, 1357, 1358, 1359,1360, 1363, 1364, 1365, 1366, 1367, 1368, 1369, 1370, 1371, 1372, 1376,1377, 1378, 1379, 1380, 1381, 1382, 1383, 1385, 1386, 1387, 1388, 1392,1393, 1394, 1395, 1397, 1398, 1399, 1400, 1401, 1402, 1403, 1404, 1405,1406, 1407, 1408, 1409, 1410, 1415, 1416, 1417, 1418, 1419, 1420, 1421,1422, 1423, 1425, 1426, 1427, 1428, 1429, 1430, 1431, 1432, 1433, 1434,1438, 1439, 1440, 1441, 1442, 1443, 1444, 1445, 1446, 1447, 1449, 1450,1451, 1452, 
1453, 1454, 1455, 1461, 1462, 1463, 1464, 1465, 1466, 1467,1468, 1469, 1470, 1471, 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480,1481, 1482, 1485, 1486, 1487, 1488, 1489, 1490, 1491, 1492, 1493, 1494,1495, 1496, 1497, 1498, 1499, 1501, 1502, 1503, 1504, 1505,1506, 1507,1509, 1512, 1513, 1514, 1515, 1516, 1517, 1518, 1519, 1521, 1522, 1523,1524, 1525, 1526, 1527, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535,1536, 1537, 1544, 1545, 1546, 1547, 1548, 1549, 1550, 1551,1552, 1553,1554, 1555, 1556, 1557, 1558, 1559, 1560, 1561,1562, 1570, 1571, 1572,1573, 1574, 1575, 1576, 1577, 1578, 1579, 1580, 1581, 1584, 1585, 1586,1587, 1588, 1589, 1590, 1591, 1592, 1593, 1595, 1596, 1597, 1598, 1599,1600, 1601, 1602,1603, 1604, 1605, 1606, 1607, 1608, 1609, 1610, 1613,1614, 1615, 1616, 1617, 1621, 1622, 1623, 1624, 1625, 1628, 1629, 1630,1631, 1632, 1633, 1634, 1637, 1638, 1639, 1642, 1643, 1644, 1645, 1646,1647, 1648, 1649, 1650, 1651, 1652, 1653, 1654, 1659, 1660, 1661, 1662,1663, 1664, 1665, 1666, 1667, 1668, 1669, 1670, 1671, 1674, 1675, 1676,1677, 1678, 1679, 1681, 1683, 1684, 1685, 1690, 1691, 1692, 1693, 1694,1695, 1697, 1698, 1699, 1700, 1701, 1704, 1705, 1706, 1707, 1708, 1709,1710, 1711, 1944, 1946, 1948, 1950, 1952, 1954, 1956, 1958, 1960, 1962,1964, 1966, 1968, 1970, and 1972; (iii) a nucleotide sequence encodingthe polypeptide sequence of (i) with conservative substitutions asdefined in Table 2; (iv) a variant of the nucleotide sequences of (a) or(b), which is at least 80% identical to a sequence of (i) or (ii); (v)an orthologous sequence of the nucleotide sequences of (i) or (ii),which is at least 80% identical to a sequence of (i) or (ii); (vi) aparalogous sequence of the nucleotide sequences of (i) or (ii), which isat least 80% identical to a sequence of (i) or (ii); (vii) a nucleotidesequence encoding a polypeptide comprising a conserved domain thatexhibits at least 80% sequence homology with the conserved domain of thepolypeptide of (i); and wherein said 
conserved domain of (i) is boundedby amino acid residue coordinates according to Tables 5a and 5b, and(viii) a nucleic acid sequence that hybridizes to the polynucleotide of(i) or (ii) under stringent conditions; (b) inserting the polynucleotideinto an expression cassette according to claim 97; (c) introducing theexpression cassette into a plant or a cell of a plant to overexpress thepolypeptide, thereby producing said modified plant; and (d) selectingsaid modified plant having said modified trait.
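Elements (iv) through (vi) of claim 101 recite sequences that are "at least 80% identical" to a reference sequence. The following is a minimal illustrative sketch of one way such a threshold could be checked. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identity over all aligned columns that are not gap-only. The claims do not specify the alignment method or the denominator, so both are assumptions of this sketch, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns in which both residues match.

    Assumes the inputs are pre-aligned (equal length). Columns where both
    sequences have a gap are ignored; a gap against a residue is a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # gap-only column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only; not taken from the sequence listing.
    print(percent_identity("ATGCATGCAT", "ATGCATGCTT"))
    print(meets_identity_threshold("AAAAA", "AATTT"))
```

Alignment tools differ in how they treat gaps and terminal overhangs, so a value computed this way may differ from the figure a particular program reports for the same pair.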
 102. The method of claim 101, wherein said stringent conditions are of 6×SSC and 65° C.
 103. The method of claim 102, wherein the encoded polypeptide of (i)-(viii) is SEQ ID NO: 468.
 104. The method of claim 101, wherein: the transgenic plant possesses an altered trait as compared to a non-transformed plant; or the transgenic plant exhibits an altered phenotype as compared to said non-transformed plant; or the transgenic plant expresses an altered level of one or more genes associated with a plant trait as compared to said non-transformed plant; wherein said non-transformed plant does not comprise the recombinant polynucleotide.
 105. The method of claim 101, wherein the plant is selected from the group consisting of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, clover, sugarcane, turf, banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, tobacco, tomato, watermelon, mint and other labiates, rosaceous fruits, and vegetable brassicas.
 106. The method of claim 101, wherein the encoded polypeptide is expressed and regulates transcription of a gene.
 107. The method of claim 101, further comprising a constitutive, inducible, or tissue-specific promoter operably linked to said polynucleotide sequence or said complementary polynucleotide sequence.
 108. The method of claim 101, wherein said altered trait is a trait selected from the group consisting of enhanced tolerance to abiotic stress, enhanced tolerance to glyphosate, altered hormone sensitivity, altered disease resistance, altered sugar sensing, earlier flowering, later flowering, altered flower structure, altered inflorescence architecture, altered shoot meristem development, altered branching pattern, reduced apical dominance, altered trichome density, altered stem morphology, increased root growth, increased root hairs, altered seed development, altered seed germination, altered cell differentiation, altered cell proliferation, rapid plant development, premature senescence, lethality when overexpressed, increased necrosis, increased plant size, larger seedlings, more compact plants, dark green leaves, leaf shape, light green leaves, variegation, glossy leaves, seed coloration, increased seed size, decreased seed size, altered seed shape, increased leaf wax, altered leaf prenyl lipid content, increased leaf insoluble sugars, increased leaf anthocyanins, altered leaf fatty acid content, increased seed oil, decreased seed oil, altered seed fatty acid content, increased seed protein, decreased seed protein, altered seed prenyl lipid content, increased seed anthocyanins, increased root anthocyanins, altered light response, altered shade avoidance, or increased plant anthocyanin level.
 109. A method of identifying a factor that is modulated by or interacts with a polypeptide encoded by the polynucleotide sequence of claim 1, said method comprising: expressing a polypeptide encoded by said polynucleotide sequence of claim 1 in a plant; and identifying at least one factor that is modulated by or interacts with said polypeptide.
 110. A method for identifying at least one downstream polynucleotide sequence that is subject to a regulatory effect of any of the polypeptides of claim 1, said method comprising: expressing any of the polypeptides of claim 1 in a plant cell; and identifying RNA or protein produced as a result of said expression.
 111. The method of claim 110, wherein said identifying is by Northern analysis, RT-PCR, microarray gene expression assays, reporter gene expression systems, subtractive hybridization, differential display, representational differential analysis, or by two-dimensional gel electrophoresis of one or more protein products.